OPTIMAL CONTROL OF THE M/G/1 QUEUEING SYSTEM WITH REMOVABLE
SERVER - LINEAR AND NON-LINEAR HOLDING COST FUNCTION




BY 

PETER  ORKENYI 

TECHNICAL  REPORT  NO.  65 
AUGUST  1976 


PREPARED  UNDER  CONTRACT 
N00014-76-C-0418  (NR-047-061) 
FOR  THE  OFFICE  OF  NAVAL  RESEARCH 


Reproduction  in  Whole  or  in  Part  is  Permitted 
for  any  Purpose  of  the  United  States  Government 

This  document  has  been  approved  for  public  release  and  sale; 
its  distribution  is  unlimited 




DEPARTMENT OF OPERATIONS RESEARCH
STANFORD  UNIVERSITY 
STANFORD,  CALIFORNIA 


Frederick  S.  Hillier,  Project  Director 


This  research  was  supported  in  part  by 
NATIONAL  SCIENCE  FOUNDATION  GRANT  ENG  75-14847 




OPTIMAL CONTROL OF THE M/G/1
QUEUEING SYSTEM WITH REMOVABLE SERVER - LINEAR
AND NON-LINEAR HOLDING COST FUNCTION


Peter  Orkenyi 


In recent years, various queueing control problems have been studied
and solved by a number of investigators. A brief but excellent survey
of the literature on this subject can be found in Gross and Harris
(1974, pp. 364-371). In most cases, the studies have concerned a single
server. In this report, we consider the M/G/1 queueing system with
removable server.

In Section 1, the problem is defined, some potential applications
are outlined, and previous studies of the problem are reviewed. The
problem is then formulated as a semi-Markov decision process in Section
2. In Section 3, the case of linear holding cost is considered. Finally,
the case of non-linear holding cost is considered in Section 4.


1.  Introduction. 

The M/G/1 queueing system with removable server was first studied
by Yadin and Naor (1963). Their idea was to utilize the idle time of
the server in the M/G/1 queueing system, since this time can be
substantial. Therefore, they proposed to remove the server when the system
becomes empty (thus letting the server perform some other useful
duty), and to bring him back when the number of customers in the system
reaches a certain critical number. We investigate this idea by
considering the optimal control of the queueing system.


At each point in time, the server is either on or off. While off,
no customers are being served. While on, the customers are served just
[...] procedure at the end of which the server is on. Its duration is the
start-up time. It is assumed that the start-up times are independent random
variables with common cumulative distribution function G. It is also
[...] is a service cost K incurred each time a service is initiated. Second,
[...] start-up cost $R_1$ and a shut-down cost $R_2$ are incurred upon each
completion of a start-up and a shut-down, respectively. Fourth, there is
a holding cost for holding customers in the system. It is incurred at
a rate which is a non-negative, non-decreasing function (h) of the
number of customers in the system.

Some comments about these costs are appropriate here. The service
cost K may actually represent the expected (discounted) cost for giving
service to a customer. Likewise, $R_1$ and $R_2$ may actually represent
the effect of costs incurred during the start-up and shut-down times,
respectively (although the shut-down times are not considered explicitly
here, they are not excluded by our formulation of the problem). We
assume that r, $R_1$ and $R_1 + R_2$ are non-negative.

In general terms, the objective is to find a policy for turning
the server on and off such that expected costs are minimized. The problem
is considered both with and without the use of discounting. When the
costs are not discounted, two optimality criteria are used. The first
one is the average cost criterion, according to which a policy is optimal
if it minimizes the long run expected average cost. The second criterion
is the undiscounted cost criterion. A policy is optimal for this
criterion if it minimizes the long run expected cost where a cost incurred
at a rate equal to the minimum long run expected average cost is
subtracted from the original costs. When the costs are discounted, the
discounted cost criterion is used. A policy is optimal for this criterion
if it minimizes the total expected discounted cost.

We will let $\alpha$ denote the interest rate, $N$ denote the set of
positive integers, $N_0$ denote the set of non-negative integers, and $R$
denote the set of real numbers.


1.2  Examples  of  Potential  Applications. 


Traffic  Control: 


Consider a bridge which can be opened and closed at a cost $r_1$ and
$r_2$, respectively (for the sake of simplicity, we assume that they are
incurred at the completion of the operations). A ship can only pass
under the bridge if it is open. If a ship arrives and the bridge
is closed, the ship must wait until the bridge is opened before it can
pass under it. There is a cost for keeping the ships waiting, and it
is incurred at a rate h for each ship. The flow of traffic on the
bridge is stopped when the bridge is opened, and it only resumes when
the bridge is closed again. A cost is incurred at a rate r when the
traffic on the bridge is interrupted. The problem is to determine when
the bridge should be opened and closed.

That this problem can be viewed as an M/G/1 queueing system with
removable server can be seen as follows. Let the ships be the customers
and let the bridge be the server. The service time is the time it takes
for a ship to pass under the bridge (we assume that there are physical
constraints, so that only one ship can pass under the bridge at a
time). The start-up time is the time it takes to open the bridge.
Clearly, the cost structure here is the same as in the M/G/1 queueing
system under consideration. Just let K be the expected (discounted)
cost for halting the traffic on the bridge while a ship passes under
it, and let $R_1$ and $R_2$ be such that they represent the direct cost
for opening and closing the bridge plus the cost for halting the traffic
on the bridge, while the bridge is being opened and closed, respectively.
Notice that the holding cost function is linear.


Computer  Time-Sharing  Control: 

Consider a company which has only one computer, but several
terminals. The jobs originating from the terminals are the on-line jobs, and
the jobs delivered to the operating room are the off-line jobs. The
on-line jobs have priority over the off-line jobs. In fact, there is
a cost incurred at a rate h for each on-line job which is kept waiting,
while there are no costs associated with keeping off-line jobs waiting.
However, a cost is incurred at a rate r while the computer does not
process off-line jobs. If the computer is processing an off-line job
when an on-line job arrives, the off-line job may be thrown out of
the computer so that it can start processing the on-line job. When
a job is thrown out of the computer, its entire memory content is
transferred to an auxiliary memory device such that the processing of the
job may be resumed later. If an off-line job is always thrown out of
the computer when an on-line job arrives, there may be an excessive
shifting of data from the computer to the auxiliary memory device and
vice versa. It may therefore be desirable to wait until a number of
on-line jobs have arrived before throwing an off-line job out of the
computer. The problem is to determine when an off-line job should be
thrown out of the computer (if at all).

That this problem can be viewed as an M/G/1 queueing system with
removable server can be seen as follows. Let the on-line jobs be the
customers, and let the computer be the server. The service time is the
time it takes to execute an on-line job, and the start-up time is the
time it takes to shift the memory content of the computer to an auxiliary
memory device. Clearly, the cost structure here is the same as in the
M/G/1 queueing system under consideration. Just let K be the expected
(discounted) cost for not using the computer for off-line jobs while an
on-line job is being processed, and let $R_1$ and $R_2$ be such that they
represent the cost for not using the computer for off-line jobs while
data is shifted from the computer to an auxiliary device and vice versa,
respectively. Notice that the holding cost function is linear.


Production Control:

Consider a manufacturing company which uses a high efficiency
production line for the production of items of a certain type, say type A.
The expected (discounted) cost for producing an item of type A is $K_1$.
When an item of type A is completed, a reward $K_2$ is received. In
order to produce an item of type A, an item of type B is needed.
Items of type B arrive to the production line according to a stationary
Poisson process. There is a cost for holding items of type B in the
system, and it is incurred at a rate which is a non-decreasing,
non-negative function of the number of items of type B present. This
cost may represent the costs associated with storing and maintaining
the items. When there are no items of type B present, there are two
alternative actions available. The first one is simply to wait for
items of type B to arrive. The second one is to switch the production
at the production line to the production of items of type C. In order
to do this, however, one has to set up the production line for the
production of items of type C. Also, once the production line is set up
for the production of items of type C, it has to be set up for the
production of items of type A before the production of these items can
be resumed. There is a setup cost associated with each setup. There
is also a cost for not producing items of type C. This cost is incurred
at a rate r while items of type C are not produced. The problem
is to determine when the production at the production line should be
switched from the production of one type of items to the production of
another type of items (if at all).


That this problem can be viewed as an M/G/1 queueing system with
removable server can be seen as follows. Let the items of type B be
the customers, and let the production line be the server. The service
time is the time it takes to produce an item of type A, and the start-up
time is the time it takes to set up the production line for the production
of items of type A. Clearly, the cost structure here is the same as
in the M/G/1 queueing system under consideration. First let K
represent the sum of the service cost, the product completion reward and the
cost for not producing items of type C when an item of type A is
produced. Also, let $R_1$ and $R_2$ represent the setup costs plus the
cost for not producing items of type C while the production line is
being set up for the production of items of type A and for the
production of items of type C, respectively. Notice that the holding
cost function may be non-linear.

1.3 Some Terminology.

We are interested in showing that certain simple intuitive types of
policies are optimal. These policies are the hysteretic policies. A
policy is called hysteretic if there are two integers, say m and n
(n < m), such that the server is always turned on (or kept on) when
the number of customers in the system is greater than or equal to m,
and such that he is always turned off (or kept off) when the number of
customers in the system is less than or equal to n. This policy is
denoted by $\pi(n,m)$. The numbers m and n are the upper and lower
intervention points, respectively.

Confer with Gebhard (1966).

If the lower intervention point is less than zero, or if the upper
intervention point is equal to plus infinity, then the hysteretic policy
is degenerate. Otherwise, the policy is non-degenerate. Hysteretic
policies whose upper intervention points are finite and lower
intervention points are less than one are called natural hysteretic policies.
The different types of hysteretic policies are pictured in Figure 1.
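
The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it is assumed (as the word hysteretic suggests) that the server status is left unchanged when the number of customers lies strictly between the two intervention points.

```python
import math


def hysteretic_action(i, j, n, m):
    """Return the action (1 = server on, 0 = server off) of pi(n, m).

    i is the number of customers in the system, j is the current server
    status (1 = on, 0 = off), and n < m are the lower and upper
    intervention points.
    """
    if not n < m:
        raise ValueError("intervention points must satisfy n < m")
    if i >= m:
        return 1  # turn the server on, or keep it on
    if i <= n:
        return 0  # turn the server off, or keep it off
    return j      # assumption: status is unchanged in between


def is_degenerate(n, m):
    """Lower point below zero, or upper point equal to plus infinity."""
    return n < 0 or math.isinf(m)


def is_natural(n, m):
    """Finite upper point and lower point less than one."""
    return (not math.isinf(m)) and n < 1
```

For example, under pi(0, 3) a server that is on with two customers present stays on, while a server that is off with two customers present stays off.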

The aim of this study is to prove that there always exists a
hysteretic policy which is optimal, and to give the conditions for when
the various types of hysteretic policies are optimal. For the case
where the holding cost function is linear, especially explicit and
convenient results are obtained.


1.4  Previous  Studies  of  the  Problem. 

As mentioned before, Yadin and Naor (1963) were the first ones to
study the M/G/1 queueing system with removable server. They examined
the steady-state behavior of the system, given that a natural
non-degenerate hysteretic policy is used. Using a linear holding cost
function, they found the value of the upper intervention point which
minimizes the expected cost rate in steady-state.

Heyman (1968) was the first one to consider the optimal control
of the M/G/1 queueing system with removable server. As with Yadin
and Naor, he assumed a linear holding cost function. In addition, he
assumed that the start-up times were zero. He considered the problem
both with and without discounting, and proved the existence of a
hysteretic optimal policy. However, his proofs were incomplete.

Figure 1. The types of hysteretic policies, where
the  x-axis  indicates  the  status  of  the 
server,  the  y-axis  indicates  the  number 
of  customers  in  the  system,  and  the 
arrows  indicate  how  the  system  moves. 


Sobel (1969) considered the GI/G/1 queueing system with removable
server. However, he used a cost structure which, it seems, would only
be natural if the GI/G/1 system were in fact an M/G/1 system. He
used the average cost criterion. Under some fairly weak conditions, he
proved that there is a non-degenerate hysteretic policy which is optimal
among all stationary policies.

Bell  (1971)  considered  the  same  problem  as  Heyman,  but  only  with 
discounting. He completed Heyman's proofs, and also gave an efficient
algorithm  for  finding  an  optimal  policy. 

Blackburn (1971) independently obtained results similar to those of
Bell. He also considered the more general case where the holding cost
function is any non-negative, non-decreasing, convex function with a
bounded slope. He used discounting, and under certain weak conditions
proved that there is a non-degenerate hysteretic policy which is optimal.
However, the present author has found that his proof was incomplete at
one point (namely in the proof of Lemma 18, Chapter 5). Intuitively, the
result seems to be true, so it is still hoped that the proof can be
completed.

Reed (1974a) also considered the M/G/1 queueing system with
removable server. He used a new approach to the problem and derived
similar, but somewhat more explicit results than those of Bell and Blackburn.
Later, Reed (1974b) extended his previous results to cover the case of
non-instantaneous start-up and shut-down times.

Recently, Deb (1976) considered the M/G/1 queueing system with
removable server (actually he considered bulk service, but by letting
the bulk size be equal to one, his problem becomes the same as ours).
He allowed a general non-negative, non-decreasing holding cost function,
but assumed instantaneous start-ups. His main result was that there
exists a natural non-degenerate hysteretic policy which is optimal if
the slope of the holding cost function is bounded below by a certain
constant.

Other variants of the M/G/1 queueing system have been considered
by Bell (1975), Blackburn (1972), Tijms (1975) and Levy and Yechiali
(1975). In particular, Bell considered the system with several customer
classes, Blackburn considered the system with balking and reneging,
Tijms considered the system where the service time of a customer becomes
known when he enters the system, and Levy and Yechiali considered the
system where the server is removed for a random period of time with a
given distribution function.


2. A Semi-Markov Decision Process Formulation.


The M/G/1 queueing system with removable server can be formulated
as a semi-Markov decision process. In order to do this, a state space,
an action space for each state, a law of motion and a cost function must
be specified. We first identify the decision epochs, the state of the
system and the set of permissible actions.

The decision epochs are the epochs when customers arrive and depart
with the exception of those arrivals which occur while the server is
giving service to another customer. At each decision epoch, the state
of the system is defined as the pair of integers indicating the number
of customers in the system and the status of the server (the second
integer being 1 if the server is on and 0 if he is off). Thus, the
state space becomes $N_0 \times \{0,1\}$, where $N_0$ is the set of
non-negative integers.


At each decision epoch there are always two available actions, action
0 and action 1. Action 0 is to turn the server off (or keep him off
if already off), and action 1 is to turn him on (or keep him on if
already on). Thus, each action space becomes $\{0,1\}$.

The  law  of  motion  and  the  cost  function  are  in  principle  determined 
now.  In  order  to  avoid  any  ambiguities,  a formal  description  of  the  law 
of  motion  and  the  cost  function  is  included  below. 

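The law of motion, q, is the mapping from
$N_0 \times \{0,1\} \times \{0,1\} \times N_0 \times \{0,1\} \times R$
into $R$, given by

\[
q(i,j,k,i',j',t) =
\begin{cases}
1, & \text{if } i'=i,\ j=1,\ k=j'=0,\ t>0,\\[4pt]
1 - e^{-\lambda t}, & \text{if } i=0,\ i'=1,\ j=j'=k=1,\ t>0,\\
 & \text{or } i'=i+1,\ j=j'=k=0,\ t>0,\\[4pt]
\displaystyle\int_0^t e^{-\lambda u}\,\frac{(\lambda u)^{i'-i+1}}{(i'-i+1)!}\,dF(u), & \text{if } i' \ge i-1 \ge 0,\ j=j'=k=1,\ t>0,\\[8pt]
\displaystyle\int_0^t e^{-\lambda u}\,\frac{(\lambda u)^{i'-i}}{(i'-i)!}\,dG(u), & \text{if } i' \ge i,\ j=0,\ j'=k=1,\ t>0,\\[8pt]
0, & \text{otherwise},
\end{cases}
\]

for $i \in N_0$, $j \in \{0,1\}$, $k \in \{0,1\}$, $i' \in N_0$,
$j' \in \{0,1\}$ and $t \in R$.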

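The cost function c is a mapping from
$N_0 \times \{0,1\} \times \{0,1\} \times R$ into $R$, given by

\[
c(i,j,k,t) =
\begin{cases}
R_2, & \text{if } k < j,\ t > 0,\\[4pt]
r t, & \text{if } i = 0,\ j = k = 1,\ t > 0,\\[4pt]
h(i)\,t, & \text{if } j = k = 0,\ t > 0,\\[4pt]
\cdots + \displaystyle\sum_{n \in N_0} h(i+n) \int_0^t (1-F(u))\,\frac{(\lambda u)^n}{n!}\,e^{-\lambda u}\,du, & \text{if } i > 0,\ j = k = 1,\ t > 0,\\[8pt]
\cdots + \displaystyle\sum_{n \in N_0} h(i+n) \int_0^t (1-G(u))\,\frac{(\lambda u)^n}{n!}\,e^{-\lambda u}\,du, & \text{if } j = 0,\ k = 1,\ t > 0,\\[8pt]
0, & \text{otherwise},
\end{cases}
\]

for $i \in N_0$, $j \in \{0,1\}$, $k \in \{0,1\}$ and $t \in R$.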

The interpretation of q and c is as follows. Consider a decision
epoch. Suppose that the state of the system at that decision epoch is
(i,j) and that the action taken there is k. Pick a state (i',j')
and a time t. Then q(i,j,k,i',j',t) is just the joint probability
that the next decision epoch occurs within a time t and that the state
of the system at that decision epoch is (i',j'). Furthermore, c(i,j,k,t)
is just the expected cost accumulated within time t after the first
decision epoch considered here. We now introduce some general notation
to be used later.

Let $\mathcal{P}$ and $\mathcal{D}$ denote the class of all policies and
the class of stationary, deterministic policies, respectively. For each
$\pi \in \mathcal{D}$, let $\varphi_\pi$ denote the long-run expected
average cost (per unit of time), given that the policy $\pi$ is used (the
start-state is irrelevant in this case). For each $\pi \in \mathcal{D}$,
$i \in N_0$ and $j \in \{0,1\}$, let $w_\pi(i,j)$ denote the long-run
expected cost in excess of what is indicated by $\varphi_\pi$, given
that the start-state is (i,j) and that the policy $\pi$ is used. Finally,
for each $\pi \in \mathcal{P}$, $i \in N_0$ and $j \in \{0,1\}$, let
$v_\pi(i,j)$ denote the total expected discounted cost given that the
start-state is (i,j) and that the policy $\pi$ is used.

A policy $\pi$ is average optimal if it minimizes $\varphi_\pi$
($\pi \in \mathcal{D}$), and it is undiscounted optimal if it minimizes
$w_\pi(i,j)$ for each $(i,j) \in N_0 \times \{0,1\}$ among all average
optimal policies (in $\mathcal{D}$). A policy is discounted optimal if
it minimizes $v_\pi(i,j)$ for each $(i,j) \in N_0 \times \{0,1\}$
($\pi \in \mathcal{P}$).

In  order  to  be  able  to  determine  whether  a given  policy  is  optimal 
or  not,  we  will  need  some  optimality  conditions.  Fortunately,  the  problem 
without  discounting  can  be  solved  quite  directly,  so  we  need  only  consider 
the  problem  with  discounting  here. 

Optimality  conditions  for  semi-Markov  decision  processes  were  given 
by  Orkenyi  (1976).  The  important  concepts  of  improvable  and  unimprovable 
policies  were  introduced  there. 

A policy  is  improvable  if  there  is  a start-state  such  that  the 
expected  discounted  cost,  given  that  start-state  (and  the  policy  under 
consideration),  can  be  reduced  by  changing  the  first  action  chosen  by 
the  policy.  A policy  is  unimprovable  if  it  is  not  improvable. 

More formally, for each $\pi \in \mathcal{D}$, let $\mathcal{D}(\pi)$
denote the set of (deterministic) policies which use the same decision
rule as $\pi$ after the first decision epoch. A policy $\pi^*$ in
$\mathcal{D}$ then is unimprovable if

\[
v_{\pi^*}(i,j) \le v_\pi(i,j), \quad \text{for } i \in N_0,\ j \in \{0,1\},\ \pi \in \mathcal{D}(\pi^*).
\]

Orkenyi (1976) also showed that if a policy $\pi^*$ in $\mathcal{D}$ is
improvable, then there is a policy $\pi \in \mathcal{D}$ which is an
improvement over $\pi^*$. More specifically, let $\pi'$ be a policy in
$\mathcal{D}(\pi^*)$ such that

\[
v_{\pi'}(i,j) \le v_{\pi^*}(i,j), \quad \text{for } i \in N_0,\ j \in \{0,1\},
\]

with a strict inequality for some state (i,j). Let $\pi$ be the policy
in $\mathcal{D}$ such that it uses the same decision rule as $\pi'$ does
at the first decision epoch. Then a theorem by Orkenyi (1976) says that

\[
v_\pi(i,j) \le v_{\pi'}(i,j), \quad \text{for } i \in N_0,\ j \in \{0,1\},
\]

and $\pi$ is an improvement over $\pi^*$. This theorem is referred to as
the policy improvement theorem.

Clearly, an optimal policy must be unimprovable. But an unimprovable
policy need not always be optimal. Conditions ensuring that an
unimprovable policy is optimal are given by Orkenyi (1976). Chapter 4 there
contains a discussion of the optimality of unimprovable policies for the
M/G/1 queueing system with removable server.

It  is  convenient  to  introduce  the  following  general  notation  here. 

For any random variable T, $E_{\pi,s}\{T\}$ denotes the expected value
of T given the policy $\pi$ and start-state s.


3.  The  Case  of  Linear  Holding  Cost  Function. 

In this section we consider the case where the holding cost function
is linear. This case has been studied extensively before. Reed (1974a,
1974b) has given a characterization of the optimal policies. Bell (1971)
and Blackburn (1971) have given algorithms for finding an optimal policy.
Here, some new and stronger results are presented. The emphasis is on
obtaining results which are explicit and easy to use. The problem is
considered both with and without discounting.


3.1 The Undiscounted Case.

The problem is somewhat easier without discounting, so this case
is considered first. The optimality criterion is the undiscounted
criterion. We begin by obtaining some preliminary results.


3.1.1  Preliminaries. 

Recall that $\lambda$ is the arrival rate, $\mu$ is the service rate and
$\rho$ ($= \lambda/\mu$) is the load on the system. Let $\xi$, $\eta$ and
$\gamma$ be defined by

\[
\xi = \int_0^\infty t\,dG(t), \qquad
\eta = \int_0^\infty t^2\,dG(t), \qquad
\gamma = \int_0^\infty t^2\,dF(t).
\]

In words, $\xi$ is the expected start-up time, $\eta$ is the second moment
of the start-up time, and $\gamma$ is the second moment of the service time.
We assume that these quantities are finite.

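Let T denote the time until the state (0,1) is reached, and
define $\kappa$ and $\nu$ by

\[
\kappa = E_{\pi(-1,0),(1,1)}\{T\}, \qquad
\nu = E_{\pi(-1,0),(0,0)}\{T\}.
\]

By conditioning on the time until the second decision epoch and the state
of the system at that epoch, we obtain

\[
\kappa = \frac{1}{\mu} + \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}\, i\kappa\,dF(t)
= \frac{1}{\mu} + \rho\kappa,
\]

and

\[
\nu = \xi + \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}\, i\kappa\,dG(t)
= \xi + \lambda\xi\kappa.
\]

This implies that

\[
\kappa = \frac{1}{\mu - \lambda} = \frac{1}{\mu}\cdot\frac{1}{1-\rho},
\]

and

\[
\nu = \xi + \lambda\xi\kappa = \xi\cdot\frac{1}{1-\rho}.
\]

From this, we obtain

\[
E_{\pi(-1,m),(0,0)}\{T\} = E_{\pi(0,m),(0,0)}\{T\}
= \frac{m}{\lambda} + \nu + m\kappa
= \frac{1}{1-\rho}\Bigl(\frac{m}{\lambda} + \xi\Bigr), \quad \text{for } m \in N_0.
\]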
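
As a quick numerical check of the expected cycle time, the sketch below evaluates the sum m/lambda + nu + m*kappa and the closed form (m/lambda + xi)/(1 - rho) side by side. The function names and the parameter values are illustrative assumptions; kappa = 1/(mu(1 - rho)) and nu = xi + lambda*xi*kappa are the quantities named in the derivation above.

```python
def expected_cycle_time(m, lam, mu, xi):
    """Expected time from state (0,0) until state (0,1) under pi(-1,m) or pi(0,m).

    lam: arrival rate, mu: service rate, xi: mean start-up time.
    """
    rho = lam / mu
    if not 0.0 < rho < 1.0:
        raise ValueError("the load rho = lam/mu must lie in (0, 1)")
    kappa = 1.0 / (mu * (1.0 - rho))   # time to clear one customer (busy period)
    nu = xi + lam * xi * kappa         # start-up plus clearing arrivals during it
    return m / lam + nu + m * kappa    # wait for m arrivals, start up, clear


def expected_cycle_time_closed_form(m, lam, mu, xi):
    """Closed form (m/lam + xi) / (1 - rho) of the same quantity."""
    rho = lam / mu
    return (m / lam + xi) / (1.0 - rho)
```

With lam = 1, mu = 2, xi = 0.5 and m = 3, both functions give 7.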


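Let H denote the holding cost incurred until the state (0,1)
is reached. Letting h denote the holding cost rate for each customer,
and with $\kappa = E_{\pi(-1,0),(1,1)}\{T\}$ as above, then for each
$i \in N$,

\[
E_{\pi(-1,0),(i,1)}\{H\}
= \sum_{j=1}^{i}\bigl(E_{\pi(-1,0),(1,1)}\{H\} + (j-1)\kappa h\bigr)
= i\,E_{\pi(-1,0),(1,1)}\{H\} + \tfrac{1}{2}\, i(i-1)\kappa h.
\]

By conditioning on the time until the second decision epoch and the
state of the system at that epoch, we obtain

\[
E_{\pi(-1,0),(1,1)}\{H\}
= \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}(1-F(t))\,(i+1)h\,dt
+ \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}
\bigl\{ i\,E_{\pi(-1,0),(1,1)}\{H\} + \tfrac{1}{2}\, i(i-1)\kappa h \bigr\}\,dF(t)
\]
\[
= \rho\,E_{\pi(-1,0),(1,1)}\{H\} + \Bigl(\frac{1}{\mu} + \frac{1}{2}\,\frac{\lambda\gamma}{1-\rho}\Bigr)h,
\]

and

\[
E_{\pi(-1,0),(0,0)}\{H\}
= \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}(1-G(t))\, i h\,dt
+ \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}
\bigl\{ i\,E_{\pi(-1,0),(1,1)}\{H\} + \tfrac{1}{2}\, i(i-1)\kappa h \bigr\}\,dG(t)
\]
\[
= \lambda\xi\,E_{\pi(-1,0),(1,1)}\{H\} + \frac{1}{2}\,\frac{\lambda\eta}{1-\rho}\,h.
\]

This implies that

\[
E_{\pi(-1,0),(1,1)}\{H\}
= \frac{1}{\lambda}\Bigl(\frac{\rho}{1-\rho} + \frac{1}{2}\,\frac{\lambda^2\gamma}{(1-\rho)^2}\Bigr)h,
\]

and

\[
E_{\pi(-1,0),(0,0)}\{H\}
= \Bigl(\frac{\rho}{1-\rho}\,\xi + \frac{1}{2}\,\frac{\lambda^2\gamma}{(1-\rho)^2}\,\xi
+ \frac{1}{2}\,\frac{\lambda\eta}{1-\rho}\Bigr)h.
\]

From this, we obtain

\[
E_{\pi(-1,m),(0,0)}\{H\} = E_{\pi(0,m),(0,0)}\{H\}
\]
\[
= \sum_{i=0}^{m-1} i\,\frac{h}{\lambda} + E_{\pi(-1,0),(0,0)}\{H\} + m\xi h
+ m\,E_{\pi(-1,0),(1,1)}\{H\} + \bigl(\tfrac{1}{2}\,m(m-1) + m\lambda\xi\bigr)\kappa h
\]
\[
= \frac{1}{2}\,m(m-1)\,\frac{1}{1-\rho}\,\frac{h}{\lambda}
+ m\Bigl(\frac{\xi}{1-\rho} + \frac{1}{\lambda}\,\frac{\rho}{1-\rho}
+ \frac{1}{2}\,\frac{\lambda\gamma}{(1-\rho)^2}\Bigr)h
+ \Bigl(\frac{\rho}{1-\rho}\,\xi + \frac{1}{2}\,\frac{\lambda^2\gamma}{(1-\rho)^2}\,\xi
+ \frac{1}{2}\,\frac{\lambda\eta}{1-\rho}\Bigr)h,
\quad \text{for } m \in N_0.
\]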


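Let C denote the cost incurred until the state (0,1) is
reached. Then, for each $m \in N_0$,

\[
\varphi_{\pi(-1,m)}
= \lambda(1-\rho)\Bigl\{\frac{r}{\lambda} + \frac{1}{\lambda}\Bigl(\frac{\rho}{1-\rho}
+ \frac{1}{2}\,\frac{\lambda^2\gamma}{(1-\rho)^2}\Bigr)h\Bigr\}
= r(1-\rho) + \Bigl(\rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)h,
\]

and

\[
\varphi_{\pi(0,m)}
= \frac{\lambda(1-\rho)}{m+\lambda\xi}\,\bigl(R_1 + R_2 + E_{\pi(0,m),(0,0)}\{H\}\bigr)
\]
\[
= \frac{1}{m+\lambda\xi}\Bigl\{(R_1 + R_2)\lambda(1-\rho)
+ \Bigl(\frac{1}{2}\,m(m-1) + m\Bigl(\lambda\xi + \rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)
+ \Bigl(\rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)\lambda\xi
+ \frac{1}{2}\,\lambda^2\eta\Bigr)h\Bigr\}.
\]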
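
The two average-cost expressions can be evaluated numerically as in the following sketch. The function names and sample values are illustrative assumptions; the formulas are the ones reached at the end of this subsection, and (as there) the policy-independent service-cost term is left out.

```python
def mean_number_always_on(lam, mu, gamma):
    """rho + lam^2 * gamma / (2 (1 - rho)): mean number in system, server always on."""
    rho = lam / mu
    return rho + 0.5 * lam ** 2 * gamma / (1.0 - rho)


def average_cost_never_off(lam, mu, gamma, r, h):
    """Long-run average cost of any policy pi(-1, m): the server is never removed."""
    rho = lam / mu
    return r * (1.0 - rho) + mean_number_always_on(lam, mu, gamma) * h


def average_cost_pi0(m, lam, mu, xi, eta, gamma, h, R1, R2):
    """Long-run average cost of pi(0, m): remove when empty, return at m customers."""
    rho = lam / mu
    if m + lam * xi <= 0:
        raise ValueError("cycle length is zero for m = 0 and xi = 0")
    q = mean_number_always_on(lam, mu, gamma)
    bracket = (0.5 * m * (m - 1) + m * (lam * xi + q)
               + q * lam * xi + 0.5 * lam ** 2 * eta)
    return ((R1 + R2) * lam * (1.0 - rho) + bracket * h) / (m + lam * xi)
```

With lam = 1, mu = 2, exponential service (gamma = 0.5), instantaneous start-up, h = 1, R1 + R2 = 4 and m = 2, the second function returns 2.5: a holding cost of 1.5 plus a switching cost of 1 per unit of time.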


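3.1.2 The Optimal Value of m in $\pi(-1,m)$.

We will now consider the optimal value of m in $\pi(-1,m)$. Let
$m'$ denote this value. That $m'$ exists is shown in Section 4.1. It is
given by

\[
w_{\pi(-1,m')}(0,0) \le w_{\pi(-1,m)}(0,0), \quad \text{for } m \in N_0.
\]

Now

\[
w_{\pi(-1,m)}(0,0) = E_{\pi(-1,m),(0,0)}\{C\} - \varphi_{\pi(-1,0)}\,E_{\pi(-1,m),(0,0)}\{T\}
+ w_{\pi(-1,m)}(0,1), \quad \text{for } m \in N_0,
\]

so that

\[
w_{\pi(-1,m+1)}(0,0) - w_{\pi(-1,m)}(0,0)
= E_{\pi(-1,m+1),(0,0)}\{C\} - E_{\pi(-1,m),(0,0)}\{C\}
- \varphi_{\pi(-1,0)}\bigl(E_{\pi(-1,m+1),(0,0)}\{T\} - E_{\pi(-1,m),(0,0)}\{T\}\bigr)
\]
\[
= \frac{1}{1-\rho}\,\frac{h}{\lambda}\Bigl(m + \lambda\xi + \rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)
- \Bigl(r(1-\rho) + \Bigl(\rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)h\Bigr)\frac{1}{\lambda(1-\rho)}
\]
\[
= \frac{1}{1-\rho}\,\frac{h}{\lambda}\Bigl(m + \lambda\xi - \frac{r}{h}(1-\rho)\Bigr), \quad \text{for } m \in N_0.
\]

Thus

\[
m' = \min\Bigl\{ m \in N_0 \Bigm| m \ge \frac{r}{h}(1-\rho) - \lambda\xi \Bigr\}.
\]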
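
A minimal sketch of this rule, assuming the reading m' = min{m in N0 : m >= (r/h)(1 - rho) - lambda*xi} of the expression above; the function name and the sample values are hypothetical.

```python
import math


def optimal_m_prime(lam, mu, xi, r, h):
    """Smallest non-negative integer m with m >= (r/h)(1 - rho) - lam*xi."""
    rho = lam / mu
    threshold = (r / h) * (1.0 - rho) - lam * xi
    # Rounding up gives the smallest integer at or above the threshold;
    # max(0, ...) keeps the result in the non-negative integers.
    return max(0, math.ceil(threshold))
```

For lam = 1, mu = 2, xi = 0.5, r = 6 and h = 1 the threshold is 2.5, so m' = 3. When the threshold is negative, m' = 0 and the server is started as soon as possible.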


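3.1.3 The Optimal Value of m in $\pi(0,m)$.

We will now find the optimal value of m in $\pi(0,m)$. Let $m''$
denote this value. Then $m''$ is given by

\[
\varphi_{\pi(0,m'')} \le \varphi_{\pi(0,m)}, \quad \text{for } m \in N_0.
\]

Now

\[
\varphi_{\pi(0,m+1)} - \varphi_{\pi(0,m)}
= \frac{\lambda(1-\rho)}{(m+1+\lambda\xi)(m+\lambda\xi)}
\bigl\{(m+\lambda\xi)\,E_{\pi(0,m+1),(0,0)}\{H\}
- (m+1+\lambda\xi)\,E_{\pi(0,m),(0,0)}\{H\} - (R_1 + R_2)\bigr\}
\]
\[
= \frac{h}{(m+1+\lambda\xi)(m+\lambda\xi)}
\Bigl\{(m+\lambda\xi)\Bigl(\frac{1}{2}\,m(m+1) + (m+1)\Bigl(\lambda\xi + \rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)\Bigr)
\]
\[
\qquad - (m+1+\lambda\xi)\Bigl(\frac{1}{2}\,m(m-1) + m\Bigl(\lambda\xi + \rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)\Bigr)
- \Bigl(\Bigl(\rho + \frac{1}{2}\,\frac{\lambda^2\gamma}{1-\rho}\Bigr)\lambda\xi + \frac{1}{2}\,\lambda^2\eta\Bigr)
- \frac{\lambda}{h}(1-\rho)(R_1 + R_2)\Bigr\}
\]
\[
= \frac{h}{(m+1+\lambda\xi)(m+\lambda\xi)}
\Bigl\{\frac{1}{2}\,m(m+1+2\lambda\xi) + \lambda^2\xi^2 - \frac{1}{2}\,\lambda^2\eta
- \frac{\lambda}{h}(1-\rho)(R_1 + R_2)\Bigr\}
\]
\[
= \frac{h}{2(m+1+\lambda\xi)(m+\lambda\xi)}
\Bigl\{\bigl(m + \lambda\xi + \tfrac{1}{2}\bigr)^2 + \lambda^2\xi^2 - \lambda\xi - \tfrac{1}{4}
- \lambda^2\eta - \frac{2\lambda}{h}(1-\rho)(R_1 + R_2)\Bigr\}
\]
\[
= \frac{h}{2(m+1+\lambda\xi)(m+\lambda\xi)}
\Bigl\{\bigl(m + \lambda\xi + \tfrac{1}{2}\bigr)^2 + \bigl(\lambda\xi - \tfrac{1}{2}\bigr)^2
- \lambda^2\eta - \frac{2\lambda}{h}(1-\rho)(R_1 + R_2) - \tfrac{1}{2}\Bigr\},
\quad \text{for } m \in N_0.
\]

Thus, we obtain

\[
m'' = \min\Bigl\{ m \in N_0 \Bigm| m \ge -\lambda\xi - \frac{1}{2}
+ \sqrt{\frac{2\lambda}{h}(1-\rho)(R_1 + R_2) + \lambda^2\eta
- \bigl(\lambda\xi - \tfrac{1}{2}\bigr)^2 + \frac{1}{2}}\ \Bigr\}.
\]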
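
The closed form for m'' can be compared with a brute-force minimization of the average cost of pi(0,m), as in the sketch below. The function names and the numbers are assumptions made for illustration; the average-cost function uses the expression from Section 3.1.1, and the square-root threshold is the reading of the formula above with a non-strict inequality.

```python
import math


def average_cost_pi0(m, lam, mu, xi, eta, gamma, h, R1, R2):
    """Long-run average cost of pi(0, m), as in Section 3.1.1."""
    rho = lam / mu
    q = rho + 0.5 * lam ** 2 * gamma / (1.0 - rho)
    bracket = (0.5 * m * (m - 1) + m * (lam * xi + q)
               + q * lam * xi + 0.5 * lam ** 2 * eta)
    return ((R1 + R2) * lam * (1.0 - rho) + bracket * h) / (m + lam * xi)


def optimal_m_double_prime(lam, mu, xi, eta, h, R1, R2):
    """Smallest non-negative integer m at or above the square-root threshold."""
    rho = lam / mu
    under_root = ((2.0 * lam / h) * (1.0 - rho) * (R1 + R2)
                  + lam ** 2 * eta - (lam * xi - 0.5) ** 2 + 0.5)
    if under_root <= 0.0:
        return 0  # the cost difference is positive for every m
    threshold = -lam * xi - 0.5 + math.sqrt(under_root)
    return max(0, math.ceil(threshold))


def brute_force_m(lam, mu, xi, eta, gamma, h, R1, R2, m_max=200):
    """Minimize the average cost over m = 0, ..., m_max (requires xi > 0)."""
    return min(range(m_max + 1),
               key=lambda m: average_cost_pi0(m, lam, mu, xi, eta, gamma, h, R1, R2))
```

With lam = 1, mu = 2, xi = 0.5, eta = 0.5, gamma = 0.5, h = 1 and R1 + R2 = 4, the threshold is about 1.24, so the formula gives m'' = 2, and the brute-force search agrees.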


3.1.4 Characterization of the Optimal Policies.

It is proven in Section 4.1 for the general case of non-decreasing,
convex holding cost function that either a policy $\pi(-1,m)$ ($m < \infty$)
or a policy $\pi(0,m)$ ($m < \infty$) is undiscounted optimal, depending
on which is average optimal. In principle therefore, all one has to do to
find the (an) optimal policy is to compute $m''$ (by using the formulae in
Section 3.1.3), compute the long run expected average cost, given that
the policy $\pi(0,m'')$ is used (by using the formulae in Section 3.1.1),
and compare it with the long run expected average cost, given that the
policy $\pi(-1,m')$ is used.


3.2  The  Discounted  Case. 

Here,  we  use  the  discounted  cost  criterion.  The  analysis  becomes 
somewhat  different  from  that  in  the  preceding  section.  One  reason  is 
that  it  is  possible  to  reformulate  the  problem  so  that  the  holding  costs 
do  not  need  to  be  considered  explicitly. 


3.2.1  Elimination of the Holding Costs from the Analysis.

Bell  (1971)  suggested  a reformulation  of  the  original  problem  such 
that  the  holding  costs  would  become  bounded.  Here,  we  show  that  the 
original  problem  can  be  reformulated  in  such  a way  that  the  holding  costs 
are  eliminated  from  the  analysis  altogether. 

Since the holding cost function is linear, the total expected
discounted holding cost is equal to the sum of the expected discounted
holding costs for the respective customers. Let, as before, h denote
the individual holding cost rate. For each n ∈ N, let t_{2n} and t_{2n+1}
denote the times when the n-th customer arrives and departs, respectively.
Then the total expected discounted holding cost is


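    E\Big\{ \sum_{n \in N} h \int_{t_{2n}}^{t_{2n+1}} e^{-\alpha t}\,dt \Big\}
      = E\Big\{ \sum_{n \in N} \frac{h}{\alpha} \big( e^{-\alpha t_{2n}} - e^{-\alpha t_{2n+1}} \big) \Big\}
      = E\Big\{ \sum_{n \in N} \frac{h}{\alpha} e^{-\alpha t_{2n}} \Big\}
        - E\Big\{ \sum_{n \in N} \frac{h}{\alpha} e^{-\alpha t_{2n+1}} \Big\} .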


Since the arrival process is not affected by the policy in use, neither
is the first term in the above expression. Therefore, it may be neglected
when searching for an optimal policy. The second term does depend on the
policy in use, and therefore cannot be neglected.

Suppose now that at each service completion a reward h/α would
be received. Clearly, the expected discounted cost arising from the
service completion rewards would just be equal to the second term in
the above expression. Therefore, the original problem must be equivalent
to the problem in which a reward h/α is received at each service
completion instead of incurring a holding cost at a rate h for each customer
in the system. Thus, since the service completion reward may be included
in the service cost, we may assume without loss of generality that there
are no holding costs. This will now be done.
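The identity behind this reformulation can be checked numerically on a single sample path. The following is a minimal sketch, not part of the original analysis: the arrival and departure times, the values of h and α, and all function names are hypothetical. It compares a numerical integral of the discounted holding cost with the closed form (h/α)(e^{-αa} - e^{-αd}), and then splits the closed form into an arrival term (policy independent) and a service completion reward term.

```python
import math

def discounted_holding_cost_numeric(h, alpha, arrive, depart, steps=20_000):
    """Midpoint-rule approximation of the integral of h*exp(-alpha*t) over [arrive, depart]."""
    dt = (depart - arrive) / steps
    return sum(h * math.exp(-alpha * (arrive + (k + 0.5) * dt)) * dt
               for k in range(steps))

def discounted_holding_cost_closed(h, alpha, arrive, depart):
    """Closed form (h/alpha) * (exp(-alpha*arrive) - exp(-alpha*depart))."""
    return (h / alpha) * (math.exp(-alpha * arrive) - math.exp(-alpha * depart))

# Hypothetical sample path: (arrival time, departure time) of three customers.
sojourns = [(0.4, 1.9), (1.1, 2.5), (3.0, 3.2)]
h, alpha = 2.0, 0.1

numeric = sum(discounted_holding_cost_numeric(h, alpha, a, d) for a, d in sojourns)
closed = sum(discounted_holding_cost_closed(h, alpha, a, d) for a, d in sojourns)

# Split of the closed form: the first sum depends only on the arrival times,
# the second is a reward h/alpha collected at each service completion.
arrival_term = sum((h / alpha) * math.exp(-alpha * a) for a, _ in sojourns)
reward_term = sum((h / alpha) * math.exp(-alpha * d) for _, d in sojourns)

print(numeric, closed, arrival_term - reward_term)
```

Only `reward_term` changes when the departure times change, which is the sense in which the holding cost can be traded for a completion reward.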


3.2.2  Preliminaries.

Let σ, ω and ξ be given by

    \sigma = \frac{\lambda}{\lambda+\alpha} ,

    \omega = \int_0^\infty e^{-\alpha t}\,dF(t) ,

    \xi = \int_0^\infty e^{-\alpha t}\,dG(t) .

In words, σ is the Laplace transform of the inter-arrival times, ω
is the Laplace transform of the service times and ξ is the Laplace
transform of the start-up times.

Let, as before, T denote the time until the state (0,1) is reached,
and define ψ and χ by

    \psi = E_{\pi(-1,0),(1,1)}\{ e^{-\alpha T} \} ,

    \chi = E_{\pi(-1,0),(0,0)}\{ e^{-\alpha T} \} .


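By conditioning on the time until the second decision epoch and the
state of the system at that epoch, we obtain

    \psi = \sum_{i \in N_0} \int_0^\infty \frac{(\lambda t)^i}{i!}\, e^{-(\alpha+\lambda)t}\, \psi^i \,dF(t)
         = \int_0^\infty e^{-(\alpha+\lambda-\lambda\psi)t}\,dF(t) ,

    \chi = \sum_{i \in N_0} \int_0^\infty e^{-(\alpha+\lambda)t}\, \frac{(\lambda t)^i}{i!}\, \psi^i \,dG(t)
         = \int_0^\infty e^{-(\alpha+\lambda-\lambda\psi)t}\,dG(t) .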


Since ρ < 1, ψ is the unique solution of the first equation above in the
interval [0,1]. This can be seen as follows.

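Let g be the function from [0,1] into R, given by

    g(x) = x - \int_0^\infty e^{-(\alpha+\lambda-\lambda x)t}\,dF(t), \quad \text{for } x \in [0,1] .

Taking the derivative, we obtain

    g'(x) = 1 - \lambda \int_0^\infty t\, e^{-(\alpha+\lambda-\lambda x)t}\,dF(t)
          > 1 - \lambda \int_0^\infty t\,dF(t)
          > 0, \quad \text{for } x \in (0,1) .

Also,

    g(0) = - \int_0^\infty e^{-(\alpha+\lambda)t}\,dF(t) < 0 ,

    g(1) = 1 - \int_0^\infty e^{-\alpha t}\,dF(t) > 0 .

Therefore, by the intermediate value theorem and the strict monotonicity
of g, the equation

    g(x) = 0

has a unique solution in the closed interval [0,1].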
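Because g is increasing with g(0) < 0 < g(1), the root ψ can be located by bisection on [0,1]. The following is a minimal sketch under stated assumptions: service times are taken to be exponential with rate mu (so their Laplace transform is mu/(mu+s)), and the function names and parameter values are illustrative only. For exponential service the fixed-point equation reduces to a quadratic, which gives an independent check.

```python
import math

def solve_psi(service_lt, lam, alpha, tol=1e-12):
    """Bisection on g(x) = x - service_lt(alpha + lam - lam*x) over [0, 1].

    service_lt(s) must return the Laplace transform of the service time at s.
    """
    def g(x):
        return x - service_lt(alpha + lam - lam * x)

    lo, hi = 0.0, 1.0  # g(lo) < 0 < g(hi) whenever alpha > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed example: arrival rate 0.5, exponential service with rate 1.0, discount rate 0.1.
lam, mu, alpha = 0.5, 1.0, 0.1
psi = solve_psi(lambda s: mu / (mu + s), lam, alpha)

# For exponential service, x = mu/(mu + alpha + lam - lam*x) is equivalent to
# lam*x**2 - (alpha + lam + mu)*x + mu = 0; the root in [0, 1] is the smaller one.
b = alpha + lam + mu
psi_closed = (b - math.sqrt(b * b - 4.0 * lam * mu)) / (2.0 * lam)

print(psi, psi_closed)
```

The same routine gives χ afterwards by evaluating the start-up transform at α + λ - λψ.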

We  will  now  consider  the  costs.  It  is  useful  to  introduce  the 
following  quantities.  Let 


and 


K 

1-03 


> 


K 

1-03  ' 


°®2  “ r 

c ■ 


K 

1-03  ’ 


Also, let Z denote the total discounted cost incurred until the state
(0,1) is reached. Then, for each i ∈ N_0,


F f7l  - 2dL 

ETT(-i,o),(i,i)lzj  - l^r 


K 


By  conditioning  on  the  time  until  the  second  decision  epoch  and  the 
state  of  the  system  at  that  epoch,  we  obtain 


E7r(-l,m),  (m,0)  ^ 


-at,!-*1-"1 

e 


K)dG(t) 


= U - X*m)  • for  m € NQ  . 


Using  this,  we  obtain 


m,  K , „ .,,mx  x 

0 > + »1> 

+ (»♦)"’  ’ x(xfe  + 11 lOO-olT1 


A/  K 


+ h)  * (ot)-  • X(^.^K)(l-at)- 


-Bi am  + D ~ * X(adm,  for  m e NQ  , 


V7r(0,m)(0^0) 


= am(-i|5(^XJlfm)  + iRx  + XA2)(1-X(crt)m)‘ 


» ara(-|B  + AX/)(l-X(a^)m)‘ 


« -A  + . for  m e N . 

1-X(a\lr)ra  0 


3.2.3  The Optimal Value of m in π(-1,m).
From  the  preceding  section,  we  obtain 


V-i.»+i)(0’0) ' V-i,»)<0’0) 

= iB(l-<J)om  - D df  x(l-ot)(°t)' 


= ( 1-0 ) crm(  £B-DXi|fm)  . 


Suppose first that D < 0. Then the sign of

    v_{\pi(-1,m+1)}(0,0) - v_{\pi(-1,m)}(0,0)

cannot change from being negative to being positive, so the optimal
value of m in π(-1,m) must be either 0 or ∞. Now


V7r(-1,0)  " "B^  + l-ot  XD  > 
and


(<n|f)  (l-tn|r)|B 
(1-X(a^)m)  (l-X(cn|/)m+1) 


tll±_  *“m  + v 0ffm  - 

^1-af  V * 1-cn jf  |B  ’ 


for  m e N . 


Let  f be  the  mapping  from  NQ  into  R^  given  by 

f<“>  - !3  + Xo  13  for  ” ' Ho  • 


Vo,rt)(0'01  - vir(o,m)(°-o) 


(IMMrM  , tor  MI,. 


Notice  that  f is  an  increasing  function,  since 


f(m)  - f(m-l)  = - Xa  ^-(i-aja"1-1 


= “ Xam),  for  m e N 


Suppose first that B < 0. Then the sign of

    v_{\pi(0,m+1)}(0,0) - v_{\pi(0,m)}(0,0)

cannot change from negative to positive, as m increases, so the optimal
value of m in π(0,m) is either 0 or ∞. Now

’V(o,o)(o’0)  = T3T  ’ 


    v_{\pi(0,\infty)}(0,0) = 0 .


Therefore, the optimal value of m in π(0,m) is determined by the
sign of


XA  - IB  . 


Suppose now that B > 0. This implies that A > 0, and

    v_{\pi(0,m+1)}(0,0) - v_{\pi(0,m)}(0,0)

changes sign from negative to positive exactly once as m is increased.
Therefore, the optimal value of m in π(0,m) is


m ^ min{m  e NQ|f(m)  > ~}  . 


3.2.5  Characterization  of  the  Optimal  Policies. 

We  now  will  show  that  a hysteretic  policy  is  optimal,  and  specify 
when  the  different  types  of  hysteretic  policies  are  optimal.  Since  we 
have  a semi-Markov  decision  process  with  bounded  costs,  an  unimprovable 
policy  is  always  optimal.  Therefore,  we  will  prove  that  a policy  is 
optimal  by  proving  that  it  is  unimprovable. 


Lemma 1:  If A < 0, then π(∞,∞) is optimal.


Proof:  We  have  to  show  that 


for  1 G N0>  6 to, D,  r e o^(7r(">00))  • 

Consider  the  states  in  which  the  server  is  off.  If  a policy 
TT  e Air(-  ,») ) starts  with  turning  the  server  on  when  the  start-state 
is  (i,0),  then 

v^.(i,o)  = |(RX  + Rg)  . 
Since R_1 + R_2 > 0, we conclude that


(ijO)  < v.  (i,o),  for  i e L,  F s ^(tt(»,«>))  . 


Consider  the  states  in  which  the  server  is  on.  If  a policy 
tt  e o5{‘fr(00>“) ) starts  with  keeping  the  server  on  when  the  start-state 
is  (i,l),  then 


V0'1’  ■ xTS  * 0’e2  ’ 


v^(i,l)  = K + cuRg,  for  i e N . 


Since  r > 0 and  A < 0,  we  conclude  that 


VTT(«,«)(i,:!L)  - f°r  1 £ N>  T e ^(7r(“^“))  • 


Thus, π(∞,∞) is unimprovable and optimal.


Q.E.D.


Lemma 2:  If A > 0, B < 0 and C < 0, then π(0,∞) is optimal.


Proof:  We  only  have  to  show  that 

V7r(0,«)(i^)  for  1 e NQ,  j e (0,1),  ir  e <3EJ(tt(o,*))  . 

Consider  the  states  in  which  the  server  is  off.  Let  7T  e <^(ir(0,")) 
be  the  same  policy  as  7r(0,ro)  except  that  it  turns  the  server  on  when 
the  start-state  is  (i,o).  Suppose  that 


V1'01  < vr(0,»)(i'0)  • 


By  the  policy  improvement  theorem, 


Vo,i)(i’0)  -\r(i’o) 
This implies that


vF(o,i)(i'o)  < V(o,»)(1’o)  • 

But  from  Section  3.2.4, 

since  B < 0.  This  is  a contradiction.  Therefore, 

Vvr(0,~)(i,°)  - Vi,0)j  f°r  1 c N0'  r € ^M0'00^  * 

Consider  the  states  in  which  the  server  is  on  and  the  system  is 
not  empty.  If  a policy  7 r e ^(7T(0,“))  starts  with  turning  the  server 
off  when  the  start-state  is  (i,l),  then 


vT(i,l)  = R2  • 


Now 


Vo.-'/1'1’  = iS-  K + • s2 


Clearly  A > 0 implies  that 


V1'3-)  i'v(0,")(i’1)  • 

Consider  the  state  (0,1).  If  a policy  7T  e ^(tt(0,“))  starts  with 


keeping  the  server  on,  then 


vT(0,l)  - ^ + CT‘v7r(o,0°)(1,l) 

= + «(M  * + «o)  • 


Therefore, 


V0>1)  - Vo,”)'0’11  = - K2>  - t3"0^ 


= -a(l-<|r)C 


> 0 . 


Thus, π(0,∞) is unimprovable and optimal.


Q.E.D.


Lemma 3:  If A > 0, B < 0 and C > 0, then π(-1,∞) is optimal.


Proof:  We  only  have  to  show  that 

VTT(0,»)(i^)  for  1 e N0>  j 6 ir  e ^(7T( -!,»))  . 

Consider  the  states  in  which  the  server  is  off.  Let  v e <^(ir(-l,“)) 
be  the  same  policy  as  tt(-1,«>)  except  that  it  turns  the  server  on  when 
the  start-state  is  (i,0).  Suppose  that 

V(i’0)  \(-i,-)(1,0)  • 


V(-l,i)(i’0)  * V7T(i'0)  > 


\(-l,i)(i’0)  < vlr(-l,»)(i)l 


From  Section  3.2.3, 


VT(-l,l)tl,0)  - V-l,")(i’0)  ’ 




so 


V1-1'  > • 


Thus, π(-1,∞) is unimprovable and optimal.


Q.E.D. 


Theorem 4:  If B > 0, then a natural hysteretic policy is optimal.

Proof:  Let m' be the (an) optimal value of m in π(0,m). From
Section 3.2.4, we know that m' is finite. Suppose first that π(0,m')
is at least as good as π(-1,m) for all m ∈ N_0. Then π(0,m') is
optimal. To show this, we only need to show that

v.q  m,)(i>0)  < vF(i,j),  for  i e NQ,  j e (0,1],  7T  e ^(TrCo^1))  • 

Consider  the  state  (0,1).  Let  7 r e o0(7r(O,m’))  be  the  same  policy 

as  7r(o,m’)  except  that  it  keeps  the  server  on  in  state  (0,1).  Suppose 


V0,!)  < V7r(0,m')(0,l)  * 


By  the  policy  improvement  theorem. 


vir(-l,m')(0’1)  - V0’1*  ' 


But  we  just,  assumed  that  this  is  not  the  case,  so  we  conclude  that 

VTr(0,m‘)(O,;L)  - V0'1),  for  ^ e • 


vir(0,m,)^i,°^  - v7r(i>°)>  for  1 > m,  7T  g ^(Tr(0,m»))  , 

Consider  the  states  in  which  the  server  is  on  and  there  is  at  least 
one  customer  present.  Let  ir  e c^(7r(0,m* ) ) be  the  same  policy  as 
7r(0,m')  except  that  it  turns  off  the  server  in  a state  (i,l),  0 < i < m* . 


Suppose  that 


V1'1’  < 'V(o,„')(i'1) 


By  the  policy  improvement  theorem, 


v /.  0(i,l)  < v (i,l) 

7r(i,m’r  “ T r 


VTr(i,m’)(i,0)  < v7r(0,m*)(i,0) 


vir(i,«’)(i'0)  “ Vo,m-i)(°'0) 

- V7r(0,m')(0,0)  ' 


which  implies  that 


V(O,m')(°'0)  < VTr(0,m’)(i>0) 


This  is  equivalent  to 


oi  * v7r(0,m»)(0’0)  < V7r(o,m’)(°'0)  > 


or 


V(0,m')(0,0)  > 0 • 

This  is  a contradiction,  since 

v A 0,0)  = 0 . 

tt(0  ,«)' 

Thus 

v7r(0,ra‘)('',l)  - V1*1*'  for  0 < 1 £ m'»  it  c $0r(O,~))  • 

Let  7T  e ^(TT(  0,  m'))  be  the  same  policy  as  Tr(o,m’ ) except  that 
it  turns  off  the  server  in  a state  (i,l),  i > ra*.  Suppose  that 

V1*1’ * Vo,«,)(l'1)  • 

By  the  policy  improvement  theorem, 

- ViA)  ' 

so 

Vi,i){l,1)  < vir(0,m *)(i,1)  * 

Repeating  the  argument  n times,  we  obtain 

V-Mi/"1’11  < Vo,.’)1"1’11  • 

Taking  the  limit  on  both  sides  as  n tends  to  infinity,  we  obtain 

vir(0,0)(0A)  *!hd  * 

But 
XR2  +  R,


(Q  1) L.  + ill  B , g 

r(o,o)'0,i;  ~ l-o)  i-x  B l-x 


Since  B > 0,  R„  + R0  > 0 and  R,  > 0,  we  have  a contradiction. 

-!_  d — O. 


V7T(0>m‘)(i>1)  - Vi,;L^  f0r  1 > m'J  T fe‘  <^(7T(0>m’))  • 

We conclude that π(0,m') is unimprovable and optimal.

Now let m* be the optimal m in π(-1,m). From Section 3.2.3,
we know that m* is finite. Suppose that π(-1,m*) is at least as
good as π(0,m) for m ∈ N_0. Then π(-1,m*) is unimprovable. This
is shown in exactly the same way as the proof that π(0,m') was
unimprovable, so it will not be repeated. Thus π(-1,m*) is optimal.

We conclude that a natural hysteretic policy is optimal.


Lemma 5:  Suppose B > 0, and let m' and m" be the optimal values
of m in π(-1,m) and π(0,m), respectively. If m' < m", then
π(-1,m') is optimal. If m" < m', then π(0,m") is optimal.


Proof:  Suppose first that π(-1,m') is optimal. Then


This  implies  that 


which  in  turn  implies  that  m*  < m". 

Suppose now that π(0,m") is optimal. Then

This implies that


m - 


/ r.t.  \ ^ m 

X(ct^)  a 


g(m)  , 


l-X(CTt) 

and  the  lemma  follows. 

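Theorem 7:  Let π* denote the (an) optimal policy. If B < 0, then

    \pi^* = \begin{cases}
      \pi(\infty,\infty), & \text{for } A < 0, \\
      \pi(0,\infty),      & \text{for } A > 0,\ C < 0, \\
      \pi(-1,\infty),     & \text{for } A > 0,\ C > 0.
    \end{cases}

If B > 0, then

    \pi^* = \begin{cases}
      \pi(-1,m'), & \text{for } m' < m'', \\
      \pi(0,m''), & \text{for } m'' < m', \\
      \pi(-1,m),  & \text{for } g(m) \ge 0 \text{ and } m = m' = m'', \\
      \pi(0,m),   & \text{for } g(m) < 0 \text{ and } m = m' = m'',
    \end{cases}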

where 


n*  = min{m  e K0im  > log(|-)/log  ty) 


m"  = minfm  e NQ|f(m)  > ~)  , 

«■)  - & *'m  + + ’ 

«<■>>  - XD*“  + S?  - !® 


Proof:  These results follow directly from Lemmas 1, 2 and 3, Theorem 4,
and Lemmas 5 and 6.


Theorem 8:  Suppose B > 0, and let m' and m" be defined as in the
preceding theorem. Then


m'  _ m" 


,i  B\log  0/ log  t(-\  C 
^ D;  \>)&  • 


Proof:  By  definition, 


l*  = min{m  e NQ  | |)  , 


m"  = min{m  e N|f(m)  > - ^ . 

U *“  § D 

Since  both  -V™  and  f(m)  are  increasing  in  m. 


m’  Ur 


is  equivalent  to 


f^l0g^X  D^l0g  ^\>J  £ B * 

Simple  algebra  shows  that  this  is  equivalent  to 

B^  log  a/ log  C 

\>Tb‘ 


3.2.6  Bounds  and  Approximations. 

Based on the results in the preceding sections, it should be easy
to find the optimal policy now. The only problem may be to find m"
(given in Theorem 7). Since f(m) is an increasing function of m,
m" may be found efficiently by the bisection method (see Wilde (197^ >
pp. 300-^00)). To use this method, one needs an upper bound on m".
This upper bound should be as small as possible. We will now give an
upper bound which is also a good approximation to m" when certain
conditions are met.
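The bisection search mentioned above can be sketched as follows. This is an illustrative implementation, not the procedure from the cited reference: the function name, the threshold argument, and the stand-in function f used in the example are all assumptions. It only relies on f being non-decreasing and on `upper` being a valid upper bound for the index sought.

```python
def smallest_m_exceeding(f, threshold, upper):
    """Return min{m in {0, ..., upper} : f(m) > threshold} for a non-decreasing f.

    Raises ValueError if f(upper) does not exceed the threshold, i.e. if
    `upper` is not a valid upper bound on the index sought.
    """
    if not f(upper) > threshold:
        raise ValueError("upper is not a valid upper bound")
    lo, hi = 0, upper  # invariant: f(hi) > threshold
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) > threshold:
            hi = mid        # the answer is at mid or to its left
        else:
            lo = mid + 1    # the answer is strictly to the right of mid
    return lo

# Hypothetical increasing function standing in for f(m).
def example_f(m):
    return 1.0 - 0.8 ** m

m_found = smallest_m_exceeding(example_f, threshold=0.5, upper=50)
print(m_found)
```

The number of evaluations of f grows with the logarithm of the upper bound, which is why a tight bound is worth having.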

Notice  that 


f(m)  > tE?7  ’ v"'1’  for  m e K0  > 


and  that  the  expression  on  the  right-hand  side  of  the  above  inequality 
is  also  an  increasing  function  of  m.  Therefore 

m < min{m  e * > j 5} 

= min(m  e NQ|m  > log(^jp  | f)A°S  ^3  • 


Letting 


13  - log(iS+ 1 f)/l0B  ♦ > 


we  obtain 


m!'  < min(m  e N^|m  > b)  . 


For finding an optimal policy, it is more useful to have a relatively
tight upper bound on min(m',m") instead. We obtain


min(m,,m'')  < 


» t Tj 

minfm  e N |m  > log(^  p/log  ty)  , 
min{m  e N^jm  > b] 


rain(m',m")  < min(m  e N |m  > min(b,log(|-  ^)/log  t)3  . 


Now 


b = log^X  |^log  ♦ + log(^jF  f)/log  * > 


so 


b < log(|  |)/log  \[f 


if  and  only  if 


l-o  D 
1-oty  A 


> 1 


This is equivalent to C < 0. Therefore, b is a better upper bound
on min(m',m") than

log(|  5) /log  ♦ 


if  and  only  if  C < 0. 

The  fact  that  b may  also  be  a good  approximation  for  m"  follows 
from  the  next  theorem. 

Theorem 9:  If

    1 - \chi(\sigma\psi)^b > 0 ,

then m" is either the smallest non-negative integer above b or the
largest non-negative integer below b.

Proof : We  only  have  to  show  that 

f(b-D  < n . 


Therefore 


f(b-l)  = f(b)  - (f(b)  - f(b-'l)) 

= f(b)  - (l-a)(^b  - Xab) 

< *0>)  - lEl f • X0b+1 * 


1 - X(cnfr)  > a . 


f(b-l)  < f (b) 


— — Xab+1 
l-aijf 


Corollary  10:  If 


X A 

t B * 


A > i l-o  , X vlog  */log(oM0 
B X 1— cn|r  'l-a;  * 


Q.  E.  D. 


then  m"  is  equal  to  the  smallest  non-negative  integer  above  b or  the 
largest  non-negative  integer  below  b. 


Proof:  Straightforward  algebra  shows  that  the  condition  of  the  Corollary 

is  equivalent  to  the  condition  of  the  theorem. 

Suppose that one has found b, and that it does not seem to be
a good approximation to m". Some graphs, indicating the true value
of m" as a function of χ, ψ, σ and b, have been developed for this
case. They can be found in Appendix B.


3.2.7  The  Case  of  Erlangian  Service  and  Start-Up  Times. 


The Laplace transforms ω, ξ and χ may not always be easy
to compute, given the cumulative distribution functions F and G.


If the service times have a k-Erlang distribution, then

    \omega = \left( \frac{k\mu}{k\mu+\alpha} \right)^k ,

    \psi = \left( \frac{k\mu}{k\mu+\alpha+\lambda-\lambda\psi} \right)^k
         = \left( \frac{k}{k + \alpha/\mu + \rho(1-\psi)} \right)^k .

Since it is impossible to derive a closed form expression for ψ, some
graphs, giving ψ as a function of k, ρ and α, have been developed.
They can be found in Appendix C.

If the start-up times have a k-Erlang distribution, and if μ*
denotes the start-up "rate," then

    \xi = \left( \frac{k\mu^*}{k\mu^*+\alpha} \right)^k ,

    \chi = \left( \frac{k\mu^*}{k\mu^*+\alpha+\lambda-\lambda\psi} \right)^k .

Having computed the values of ω, ψ, ξ and χ, the optimal policy is
easy to find.
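As an alternative to reading ψ from graphs, the Erlang fixed-point equation can be iterated numerically. The following is a minimal sketch under stated assumptions: service and start-up times are Erlang with possibly different shape parameters (k and k_start), the k-Erlang transform is taken as (k·rate/(k·rate + s))^k with mean 1/rate, and the iteration is started from zero. All names and parameter values are illustrative.

```python
import math

def erlang_lt(k, rate, s):
    """Laplace transform at s >= 0 of a k-Erlang distribution with mean 1/rate."""
    return (k * rate / (k * rate + s)) ** k

def erlang_transforms(k, mu, k_start, mu_start, lam, alpha,
                      tol=1e-13, max_iter=10_000):
    """Return (omega, psi, xi, chi) for Erlang service and start-up times.

    psi is obtained by successive substitution in
    psi = erlang_lt(k, mu, alpha + lam - lam*psi), starting from psi = 0.
    """
    omega = erlang_lt(k, mu, alpha)
    xi = erlang_lt(k_start, mu_start, alpha)
    psi = 0.0
    for _ in range(max_iter):
        nxt = erlang_lt(k, mu, alpha + lam - lam * psi)
        converged = abs(nxt - psi) < tol
        psi = nxt
        if converged:
            break
    chi = erlang_lt(k_start, mu_start, alpha + lam - lam * psi)
    return omega, psi, xi, chi

# Assumed example: 3-Erlang service (rate 1.0), 2-Erlang start-up (rate 2.0),
# arrival rate 0.5 and discount rate 0.1.
omega, psi, xi, chi = erlang_transforms(3, 1.0, 2, 2.0, lam=0.5, alpha=0.1)
print(omega, psi, xi, chi)
```

With k = 1 the service time is exponential and ψ can be compared with the root of the quadratic λx² - (α+λ+μ)x + μ = 0 that lies in [0,1].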


4.  The Case of Non-Decreasing Holding Cost Function.


The case where the holding cost function is an arbitrary
non-decreasing function now will be investigated. Blackburn (1971) and
Deb (1976) have also considered the problem where the holding cost
function is not necessarily linear. The problem is considered both with
and without discounting.

4.1  The  Undiscounted  Case. 

In this section, costs are not discounted. Two optimality criteria
are used, namely the average cost criterion and the undiscounted cost
criterion. These criteria were described in Section 2. Recall that
is  the  set  of  deterministic  stationary  policies.  Only  these  policies 
are  considered  here.  Let  $ denote  the  set  of  deterministic  stationary 
policies  which  always  turn  the  server  on  (or  keep  him  on)  at  decision 
epochs  where  the  number  of  customers  in  the  system  is  larger  than  a certain 
number.  It  will  be  shown  that  only  policies  in  need  to  be  considered. 

We  assume  that  the  service  times  are  not  instantaneous  and  that 
the  holding  cost  function  is  not  bounded  from  above.  If  desired,  the 
analysis  which  follows  can  be  extended  so  that  these  assumptions  become 
unnecessary.  Without  loss  of  generality,  we  only  allow  policies  which 
do  not  turn  the  server  on  and  off  repeatedly  at  the  same  point  in  time. 

For each π ∈ 𝒟 and (i,j) ∈ N_0 × {0,1}, let φ_π(i,j) denote
the long run expected average cost, given that the start-state is (i,j)
and that the policy π is used.


Lemma  11:  For  each  v e £j,  there  is  a tt*  e such  that 

< ^(ijj),  for  (i,j)  e SQ  x (0,1}  . 

Proof : Let  $ denote  the  set  of  deterministic  stationary  policies  which 

turn  the  server  on  if  he  is  off  and  there  are  more  than  a certain  number 
of  customers  in  the  system.  Clearly 


■$/  c~:^  *-?  ‘ 


We  prove  the  lemma  by  first  showing  that  for  each  T e , there  is 
a 7T*  e §5  such  that 

< ^(1,0),  for  (i,j)  e X {0,1}  , 

and  then  showing  that  for  each  w e ^ , there  is  a T1  e ^ such  that 
the  above  inequality  holds  again. 

Therefore,  consider  a policy  it  in  ol)  , but  not  in  . Then 
there  is  a number,  say  k,  such  that  7T  does  not  turn  the  server  on  if 
he  is  off  and  there  are  k or  more  customers  in  the  system.  This  implies 
that 

    \varphi_\pi(i,0) = \varphi_\pi(j,0), \quad \text{for } k \le i \le j ,

since the expected cost incurred until a state (j,0) (j > i) is reached,
given that the start-state is (i,0) (i > k) and that the policy π
is used, is finite.

Since h is a non-decreasing function, and since the number of
customers in the system is always j or more, given that the start-state
is (j,0) (j ≥ k) and that the policy π is used,

    \varphi_\pi(j,0) \ge h(j), \quad \text{for } j \ge k .

Together  with  the  result  above,  this  implies  that 

    \varphi_\pi(i,0) = \infty, \quad \text{for } i \ge k ,

since  h is  not  bounded  from  above. 


Let π* be the same policy as π except that it turns the server
on if he is off and there are k or more customers in the system.
Clearly


f<  (i>3)  = 00 , for  o = 0,  i > k , 
eP7J_( i ^ 0 ) , otherwise  . 


This  completes  the  first  part  of  the  proof. 

Now,  consider  a policy  ir  in  ^ , but  not  in  . Then  there 
is a strictly increasing sequence of integers, {i_k}_{k∈N}, such that π
turns the server off at the decision epochs where he is on and the number
of customers in the system is i_k for some k in N. Since the service
times are not instantaneous, the probability that the number of customers
in the system will eventually exceed any given number is one. This
implies that the long run expected average holding cost, given any
start-state and the policy π, is equal to plus infinity, since for each
k ∈ N the number of customers in the system cannot decrease below i_k
once it has been exceeded. Since the long run expected average cost
due to other costs than the holding cost is always larger than minus
infinity, we must have


    \varphi_\pi(i,j) = \infty, \quad \text{for } (i,j) \in N_0 \times \{0,1\} .
any  IT*  e $)  satisfies 

f^ijd)  <Qr(i,o)>  for  (i,j)  e NQ  x (0,l) 


This  completes  the  second  part  of  the  proof. 


Q.E.D.


Lemma  12;  For  each  7T  e is  constant  over  (i,j)  e x (0,1). 

proof:  Assume  that  a policy  in  , say  7T,  is  used,  and  let  n be 

a number such that the server is always turned on (or kept on) when there
are n or more customers in the system. Since the service times are not
instantaneous, the probability that the number of customers in the system
will eventually exceed n is one.

There are two mutually exclusive and exhaustive cases, namely the
case when the expected holding cost incurred during a service, given
any number of customers in the system at the start of the service, is
finite and the case when it is infinite. In the latter case, the long
run expected average cost is equal to plus infinity for all start-states,
and the lemma holds.

In the former case, the expected holding cost incurred during a
service initiated with n or fewer customers in the system is
bounded. Since the expected number of services given before the number
of customers in the system exceeds n is bounded from above, this implies
that the expected holding cost incurred until such a time is finite.
Clearly, the expected service cost incurred until the number of customers
in the system exceeds n is also finite.

The expected switching cost incurred until the number of customers
in the system exceeds n can be seen to be finite as follows. There
are two possible cases, the case where the start-up times are instantaneous
and the case where these times are non-instantaneous. In the former
case, the expected switching cost incurred until the number of customers
in the system exceeds n is finite, since π cannot turn the server
on and off repeatedly at the same point in time and since the expected
number of services given before the number of customers in the system
exceeds n is finite. In the latter case, the expected number of
times the server is turned on and off before the number of customers in
the system exceeds n is bounded. Therefore, the expected switching cost
incurred until the number of customers in the system exceeds n is finite.
Together with the previous results, this implies that the expected total
cost incurred until the number of customers exceeds n is finite.

This in turn implies that φ_π(i,j) is constant over
(i,j) ∈ N_0 × {0,1}, since the states (i,0) are positive recurrent
for i > n (recall that ρ < 1).

Q.E.D.


Since  we  will  only  consider  policies  in  jg]  hereafter,  we  will  drop 


the  reference  to  the  start-state  in  the  following. 


Theorem 13:  There exists a natural hysteretic policy which is average
optimal.


Froof : Let  T be  a policy  ir.  $ , and  let  n be  least  number  such 


that π keeps the server on if he is on and there are more than n
customers in the system. Let m be the least number greater than or
equal to n such that π turns the server on when the state of the
system is (m,0). Then the policies π and π(n,m) have the same
positive recurrent class (of states) and they take identical actions
within that class. Using Lemma 12,

    \varphi_\pi = \varphi_{\pi(n,m)} .

For each i ∈ N, let x_i denote the long run expected proportion
of time when there are i customers in the system given that the policy
π(0,m-n) is used. Suppose that n > 0. Then the long run expected
average holding cost is

    \sum_{i \in N} x_i\, h(i), \quad \text{for } \pi(0,m-n) ,

and

    \sum_{i \in N} x_i\, h(i+n), \quad \text{for } \pi(n,m) .

The long run expected average cost due to other costs than the holding
costs is the same for π(0,m-n) and π(n,m). Since h is a
non-decreasing function which is not bounded from above, and since each x_i
is strictly positive,

    \varphi_{\pi(0,m-n)} < \varphi_{\pi(n,m)} .

Thus, we can restrict our search for an optimal policy to the class
of natural hysteretic policies. In order to prove that there is an
average optimal natural hysteretic policy, we only need to show that
there is a finite k such that

    \varphi_{\pi(-1,0)} \le \varphi_{\pi(0,m)}, \quad \text{for } m \ge k .

This will now be shown.

For  each  i and  m in  N,  let  U m denote  the  long  run  expected 
proportion  of  time  when  there  are  i or  more  customers  in  the  system, 
given  that  the  policy  ir( 0,m)  is  used.  Since  > 0, 

<P  \ > * h(i)  + k'min(K,0),  for  i e N,  m e N . 

iT\  o^rn  / — iyf?i 

Choose  ieN  such  that 

^(-1,0)  < + x'min(K»°)  * 


It can easily be shown that there is a k such that


t.  > (^(.i^o)  “ k*min(K,o))/h(i),  for  m>k, 

since  the  right-hand  side  of  the  inequality  is  less  than  one.  This 
implies  that 


    \varphi_{\pi(-1,0)} \le \varphi_{\pi(0,m)}, \quad \text{for } m \ge k .


Q.E.D. 


We now introduce some convenient terminology. If φ is the optimal
long run expected average cost, then the relative cost incurred during
a given time interval is the total cost incurred during that interval
minus φ times the length of the time interval.

For each i ∈ N_0, let C_i denote the cost incurred until the
state (i,1) is reached, and let f be the function from N_0 into R,
given by

    f(m) = E_{\pi(-1,m+1),(m,0)}\{C_m\} - E_{\pi(-1,m),(m,0)}\{C_m\}, \quad \text{for } m \in N_0 .


This  function  will  play  an  important  role  in  the  following. 


Lemma 14:  If f is a non-decreasing function and φ denotes the optimal
long run expected average cost, then the expected relative cost incurred
until the server is on (regardless of the start-state) is minimized by
π(-1,m) (or equivalently by π(0,m)), where

    m = \min\{ i \in N_0 \mid f(i) > \varphi / (\lambda(1-\rho)) \} .


Proof:  If the system starts with the server on, the lemma is trivial.
Therefore, assume now that the start-state is (i,0) for some i ∈ N_0.
With regard to the expected relative cost incurred until the server is
on, any policy which turns the server on eventually is equivalent to
a policy π(-1,m) (or π(0,m)) for some m.

For each i ∈ N_0, let T_i denote the time elapsed until the state
(i,1) is reached. We only have to show that

    E_{\pi(-1,i+1),(i,0)}\{C_i - \varphi T_i\} - E_{\pi(-1,i),(i,0)}\{C_i - \varphi T_i\}

is non-negative for i ≥ m and non-positive for i < m. But this is
just equivalent to

    f(i) \ge \varphi / (\lambda(1-\rho)), \quad \text{for } i \ge m ,
    f(i) \le \varphi / (\lambda(1-\rho)), \quad \text{for } i < m ,

since

    E_{\pi(-1,i+1),(i,0)}\{T_i\} - E_{\pi(-1,i),(i,0)}\{T_i\} = \frac{1}{\lambda(1-\rho)} .

Since f is a non-decreasing function, the lemma follows.

Q.E.D.
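One way to read the time difference 1/(λ(1-ρ)) used in the proof is as the mean wait for one more arrival, 1/λ, plus the mean M/G/1 busy period generated by that extra customer, E[S]/(1-ρ). This decomposition is an interpretation added here, not a statement taken from the report. The sketch below checks the arithmetic for assumed values of the arrival rate and the mean service time; the function name is hypothetical.

```python
def extra_time_from_waiting(lam, mean_service):
    """Mean extra time caused by waiting for one more arrival before switching on.

    Assumed decomposition: mean inter-arrival time plus the mean busy period
    of an M/G/1 queue started by a single customer.
    """
    rho = lam * mean_service
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < rho < 1")
    mean_wait_for_arrival = 1.0 / lam
    mean_busy_period = mean_service / (1.0 - rho)
    return mean_wait_for_arrival + mean_busy_period

# Assumed example values.
lam, mean_service = 0.5, 1.2
rho = lam * mean_service
lhs = extra_time_from_waiting(lam, mean_service)
rhs = 1.0 / (lam * (1.0 - rho))
print(lhs, rhs)
```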

Lemma 15:  For any set of real numbers a, b, c and d such that
b > 0 and d > 0,

    \frac{a}{b} \le \frac{c}{d}
      \iff \frac{a}{b} \le \frac{a+c}{b+d}
      \iff \frac{a+c}{b+d} \le \frac{c}{d} .
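All three inequalities in Lemma 15 reduce to ad ≤ bc after clearing the positive denominators, which is why they are equivalent. The following sketch checks the equivalences on randomly drawn integers using exact rational arithmetic; the sampling ranges, the seed and the function name are arbitrary choices made for illustration.

```python
from fractions import Fraction
import random

def mediant_equivalences_hold(a, b, c, d):
    """Check that the three inequalities of Lemma 15 have the same truth value."""
    first = Fraction(a, b) <= Fraction(c, d)
    second = Fraction(a, b) <= Fraction(a + c, b + d)
    third = Fraction(a + c, b + d) <= Fraction(c, d)
    return first == second == third

random.seed(0)
all_ok = all(
    mediant_equivalences_hold(random.randint(-20, 20), random.randint(1, 20),
                              random.randint(-20, 20), random.randint(1, 20))
    for _ in range(10_000)
)
print(all_ok)
```

In the proof of Lemma 16 the lemma is applied with a/b equal to the average cost of π(0,i) and c/d equal to f(i)λ(1-ρ), so the mediant is the average cost of π(0,i+1).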

Lemma 16:  If f is a non-decreasing function, then the (an) average
optimal value of m in π(0,m) is given by

    m = \min\{ i \in N_0 \mid f(i) > \varphi_{\pi(0,i)} / (\lambda(1-\rho)) \} .





Proof:  By  renewal  theory, 


Vo,i)  = l!ir{o,-),(o,o!,ColtV/fVo,.),(o,o)llo])>  for  is"o 


Since 


^7r(o,i+l) , (0,0)  ^C(P  E7r(0,i),(0,0)% 


(CA)  + f(i),  for  i e NQ  , 


and  since 


Ef(0,i+1),(0,0)^T0^  £'F(0,i)^T0'!  E7T(0,i),(0,0){Tl^  + x(l-p)  ’ 


for  i e N 


'o  ' 


we  obtain 


\(0,i+l)  (ETr(0,i),(0,0)[C0^  + R2  + f(i))/tS^(o,i),(0,0)^V  + \(l-~p)^  * 


for  i € Nq  • 


Using  Lemma  15, 


\(0,i)  - ^(Oji+l) 


if  and  only  if 


<P. 


V(0,i)  i 


By Theorem 13, there is an i such that


Cp7r(0,i)  - ^(0,1+1)  ’ 


so  m exists.  By  Lemma  15  and  the  definition  of  m, 


V,-)  - Vo,»*i)  - x(1-c)f(m) 
Theorem 17:  If f is non-decreasing, then there exists a natural
hysteretic policy which is undiscounted optimal, and the optimal value of
the upper intervention point (m) is given by

    m = \min\{ i \in N_0 \mid f(i) > \varphi / (\lambda(1-\rho)) \} ,

where φ is the minimum long run expected average cost.


Proof:  Consider the case where

    \varphi_{\pi(-1,0)} < \varphi_{\pi(0,i)}, \quad \text{for } i \in N_0 .

In this case, only policies which eventually turn the server on and never
turn him off are average optimal. By Lemma 14, the policy π(-1,m)
minimizes the long run expected relative cost for each start-state.
This implies that π(-1,m) is undiscounted optimal.

Consider the case where

    \varphi_{\pi(0,m)} < \min( \varphi_{\pi(-1,0)},\ \varphi_{\pi(0,m-1)},\ \varphi_{\pi(0,m+1)} ) .

From the proof of Theorem 13, only natural hysteretic policies can be
average optimal. From the proof of Lemma 16, we have

    \varphi_{\pi(0,m)} \le \varphi_{\pi(0,i)}, \quad \text{for } i \in N_0 .

This implies that only policies which take the same actions as π(0,m)
for the states which are positive recurrent under π(0,m) can be average
optimal. Since π(0,m) minimizes the expected relative cost until the
server is on (for each start-state) by Lemma 14, π(0,m) is undiscounted
optimal.




If $\varphi_{\pi(-1,m)} = \varphi_{\pi(0,m)}$ or $\varphi_{\pi(0,m)} = \varphi_{\pi(0,i)}$ for some $i \ne m$, then

a finer  analysis  is  needed  to  determine  which  of  the  corresponding  policies 
is  undiscounted  optimal.  The  fact  that  one  of  the  policies  above  is 
undiscounted  optimal  follows  from  Lemma  14.  This  completes  the  proof. 

Q.E.D. 


Corollary 18: If the start-up times are zero, or if the holding cost function is convex, then there is a natural hysteretic policy which is undiscounted optimal, and the optimal value of the upper intervention point is given by Theorem 17.


Proof: We only have to show that $f$ is a non-decreasing function.

Consider the case where the start-up times are zero. In this case

\[ f(i) = E_{\pi(0,i+1),(i,0)}[C_i] - R_1, \quad \text{for } i \in N_0 . \]

Clearly, $f$ is non-decreasing, since $h$ is non-decreasing.

Consider the case where the holding cost function is convex. Now

\[
\begin{aligned}
f(i) &= E_{\pi(0,i+1),(i,0)}[C_i] - E_{\pi(0,i),(i,0)}[C_i] \\
     &= \frac{h(i)}{\lambda} + E_{\pi(0,i+1),(i+1,1)}[C_i] + E_{\pi(0,i+1),(i+1,0)}[C_{i+1}] - E_{\pi(0,i),(i,0)}[C_i], \quad \text{for } i \in N_0 .
\end{aligned}
\]

The first two terms in the final right-hand side are non-decreasing in $i$, since $h$ is non-decreasing. The difference between the last two terms is non-decreasing in $i$, since $h$ is convex. Thus, $f$ is non-decreasing.

Q.E.D.
4.2 The Discounted Case.


The problem with discounting now will be considered. We assume that the start-up times are instantaneous. As before, let $\mathcal{D}$ denote the set of deterministic stationary policies and let $\hat{\mathcal{D}}$ denote the set of deterministic stationary policies which always turn the server on (or keep him on) at decision epochs where the number of customers in the system is larger than a certain number.

Without loss of generality, we use the convention that the server cannot be turned on immediately after he is turned off. The results obtained by Orkenyi (1976, Chapter 4) then are applicable. In particular, any unimprovable policy in $\hat{\mathcal{D}}$ is optimal. Also, the policy which always turns the server off (or keeps him off) is optimal if it is unimprovable and if its value function is finite-valued. These results will be used implicitly throughout the rest of this section.

A policy $\pi \in \hat{\mathcal{D}}$ is unimprovable for the particular semi-Markov decision process under consideration here if

(a) $v_\pi(i,0) \le v_{\pi'}(i,0)$,

(b) $v_\pi(i,0) \le v_{\pi''}(i,0)$,

(c) $v_\pi(i,1) \le v_{\pi'}(i,1)$,

(d) $v_\pi(i,1) \le v_{\pi''}(i,1)$,

for $i \in N_0$, where $\pi'$ and $\pi''$ are the same policies as $\pi$ except that they respectively turn the server on (or keep him on) and turn him off (or keep him off) at the first decision epoch.
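The four conditions amount to a one-step deviation test: at every state, following the policy must be no worse than forcing the server on, or forcing it off, at the first decision epoch. A minimal sketch of such a test over a finite range of queue lengths is given below. The three value functions are assumed to be supplied as callables `v(i, s)`, `v_on(i, s)` and `v_off(i, s)`, with `s = 0` for server off and `s = 1` for server on; these names and the tolerance are hypothetical and not taken from the report.

```python
def is_unimprovable(v, v_on, v_off, queue_lengths, tol=1e-12):
    """Check conditions (a)-(d) on a finite set of queue lengths.

    v     : value function of the policy itself
    v_on  : value function of the variant that turns (keeps) the server on first
    v_off : value function of the variant that turns (keeps) the server off first
    Each callable takes (i, s): i customers, s = 0 (server off) or 1 (server on).
    """
    for i in queue_lengths:
        for s in (0, 1):
            # (a), (c): no gain from forcing the server on at the first epoch
            if v(i, s) > v_on(i, s) + tol:
                return False
            # (b), (d): no gain from forcing the server off at the first epoch
            if v(i, s) > v_off(i, s) + tol:
                return False
    return True
```

A finite check of this kind can only refute unimprovability; establishing it for all $i \in N_0$ needs an argument for the tail, as in the lemmas of this section.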

Let  a,  a and  co  be  defined  as  before. 
Theorem 19: If


U q1  - h(i)  < 00  > 
ieN 

j\  ( f F(t)  e”^'*0:^tdt)(h(i+j)-h(i+j-l) ) < K-(l-oi)Rg  , 

JeNQ  Jo 

for  ieN, 

then  the  policy  which  always  turns  the  server  off  (or  keeps  him  off) 
is  optimal. 

Proof: We only need to show that $\pi(\infty,\infty)$ is unimprovable, since the second condition of the theorem guarantees that its value function is finite-valued.

Condition (a) holds for all $i \in N_0$, since $R_1 + R_2 \ge 0$. Conditions (b) and (d) hold trivially for all $i \in N_0$. It is now shown that condition (c) also holds for all $i \in N_0$. Let $\pi'$ be as in condition (c). Then

\[ v_{\pi'}(0,1) - v_{\pi(\infty,\infty)}(0,1) = \frac{r - \alpha R_2}{\alpha + \lambda} \ge 0 . \]


Also


\[ v_{\pi(\infty,\infty)}(i,1) = R_2 + \sum_{j \in N_0} \Bigl( \int_0^\infty \frac{(\lambda t)^j}{j!}\, e^{-(\alpha+\lambda)t}\, dt \Bigr) h(i+j), \quad \text{for } i \in N , \]


and 
oc

XTT/+ ' 
1 \ “/ 


( / F^e^dt)  • | (K  - (1-0))R2) 


= K - ( l-co)R„  . 


Also 


2)  o\( i)  < 00  . 
ieN 

Thus  the  conditions  of  the  theorem  are  satisfied  and  the  corollary  follows 
directly. 

Q.E.D. 

We will need to indicate the dependence of the value function of each policy on the start-up and shut-down costs. Therefore, for each $\pi \in \mathcal{D}$, $a \in R$, $b \in R$, let $v_{\pi,a,b}$ denote the value function of policy $\pi$, given that the start-up cost is $a$ and the shut-down cost is $b$.

For each $\pi \in \mathcal{D}$, let $u_\pi$ and $w_\pi$ be the functions from $N_0 \times \{0,1\}$ into $R$ defined by


\[ u_\pi = v_{\pi,R_1,-R_1} , \]


and


\[ w_\pi = v_{\pi,-R_2,R_2} . \]


As will be seen later, these functions will be quite useful in the following.


Lemma 21: If $\pi$ is a policy (in $\hat{\mathcal{D}}$) which always turns the server on (or keeps him on) when the number of customers in the system is greater than or equal to, say $m$, and if in addition,

\[ v_\pi(m,1) \le u_{\pi(m,m+1)}(m,1) , \]

\[ u_{\pi(i-1,i)}(i,1) \le u_{\pi(i,i+1)}(i,1), \quad \text{for } i > m , \]

then $\pi$ satisfies the conditions (a), (b), (c) and (d) for $i \ge m$.

Proof: The conditions (a) and (c) are trivially satisfied for $i \ge m$.

We now show that condition (b) is satisfied for $i \ge m$.

Observe that condition (b) is equivalent to

\[ v_\pi(i,1) \le u_{\pi(i,i+1)}(i,1) , \]

for $i \ge m$. We prove that

\[ v_\pi(i,1) \le u_{\pi(i,i+1)}(i,1), \quad \text{for } i \ge m , \]

by induction on $i$. The above inequality holds trivially for $i = m$. Suppose that it has been proven to hold for some $i \ge m$. Then

\[ v_\pi(i,1) \le u_{\pi(i,i+1)}(i,1) , \]

or equivalently

\[ v_\pi(i+1,1) \le u_{\pi(i,i+1)}(i+1,1) . \]

Using this together with the last assumption of the lemma, we obtain

\[ v_\pi(i+1,1) \le u_{\pi(i,i+1)}(i+1,1) \le u_{\pi(i+1,i+2)}(i+1,1) . \]

This completes the induction proof, and condition (b) is satisfied for $i \ge m$.
That condition (d) holds for $i \ge m$ is seen as follows. From the above results,

\[ v_\pi(i,0) = v_\pi(i,1) + R_1 , \]

\[ v_{\pi''}(i,0) = v_{\pi''}(i,1) - R_2, \quad \text{for } i \ge m . \]

Together with condition (b), this implies that

\[ v_\pi(i,1) \le v_{\pi''}(i,1) - (R_1 + R_2) \le v_{\pi''}(i,1), \quad \text{for } i \ge m , \]

since $R_1 + R_2 \ge 0$. This completes the proof of the lemma.


Q.E.D.


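Let $m'$ and $m''$ be the smallest numbers in $N \cup \{\infty\}$ such that

\[ v_{\pi(-1,m')}(0,0) \le v_{\pi(-1,m)}(0,0), \quad \text{for } m \in N_0 , \]

\[ v_{\pi(0,m'')}(0,0) \le v_{\pi(0,m)}(0,0), \quad \text{for } m \in N_0 . \]

That $m'$ and $m''$ exist follows from the fact that

\[ \lim_{m \to \infty} v_{\pi(-1,m)}(0,0) = v_{\pi(-1,\infty)}(0,0) , \]

\[ \lim_{m \to \infty} v_{\pi(0,m)}(0,0) = v_{\pi(0,\infty)}(0,0) . \]

Also notice that

\[ v_{\pi(-1,1)}(0,0) \le v_{\pi(-1,0)}(0,0) , \]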


since 


V-i,0)<0’0)  ' V-!,l)(0'0)  ‘ TPeT 


Lemma 22: If $\pi$ is a policy (in $\hat{\mathcal{D}}$) which always turns the server on (or keeps him on) when the number of customers in the system is greater than or equal to, say $m$ ($m > 0$), and if in addition,


v ,(m-l,0)  < min(v  (m-1,0),  v „(m-l,0)}  , 
/r  ii  if 


where $\pi'$ and $\pi''$ are the same policies as $\pi$ with the only exception that they do not turn the server on in the states $(m,0)$ and $(m+1,0)$, respectively, then


Proof:  Clearly 


is  equivalent  to 


is  equivalent  to 


\t  (ro-1,0)  < v^m-1,0) 


(m-1,0)  < V^, (rn-1,0) 


V(m'D  - uir(a,m+l){m,1) 
Combining these results, the lemma follows.


Q.E.D.


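Let $f$ and $g$ be the two functions from $N_0$ into $R$ given by

\[ f(m) = u_{\pi(m,m+1)}(m,1) - v_{\pi(-1,m)}(m,1), \quad \text{for } m \in N_0 , \]

\[ g(m) = u_{\pi(m,m+1)}(m,1) - v_{\pi(0,m)}(m,1), \quad \text{for } m \in N_0 . \]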


Lemma 23: If there is a $k$ such that

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) \;\begin{cases} > 0, & \text{for } i < k , \\ \le 0, & \text{for } i \ge k , \end{cases} \]

then the conditions (a), (b), (c) and (d) are satisfied for $i \ge m'$ and $i \ge m''$ for $\pi = \pi(-1,m')$ and $\pi = \pi(0,m'')$, respectively.

Proof ; By  Lemma  22, 


uir(m-l,«)t"'l!  - f0r 


m = m*  and  m = m" 


Using  this  together  with  the  condition  of  the  lemma,  we  obtain 


V(o-i,«)(m,1)  -Vm,®.!)1”’11’  for 


By Lemma 22,

\[ f(m') \ge 0 \quad \text{and} \quad g(m'') \ge 0 . \]


Thus,  we  can  use  Lemma  21  to  obtain  that  the  conditions  (a),  (b),  (c) 
Lemma 24: Under the condition of Lemma 23, the conditions (a) and (b) are satisfied for all $i \in N_0$ for both $\pi = \pi(-1,m')$ and $\pi = \pi(0,m'')$.

Proof: Follows directly from Lemma 23 and the definition of $m'$ and $m''$.


can be shown in a quite similar manner. That

\[ g(0) < 0 \]

follows from the fact that $R_1 + R_2 \ge 0$. Suppose that we have proven that

\[ g(m) < 0, \quad \text{for some } m < m'' - 1 . \]

This implies that

\[ u_{\pi(m,m+1)}(m+1,1) \le v_{\pi(0,m+1)}(m+1,1) . \]

But by the condition of the lemma,

\[ u_{\pi(m,m+1)}(m+1,1) > u_{\pi(m+1,m+2)}(m+1,1) \quad (m < m'' - 1) . \]

Thus

\[ u_{\pi(m+1,m+2)}(m+1,1) < v_{\pi(0,m+1)}(m+1,1) , \]

or equivalently

\[ g(m+1) < 0 . \]

This completes the induction proof. The last assertion of the lemma follows trivially from the definition of $m'$ and $m''$.

Q.E.D.


Lemma  26:  If 

V-i.»,)(i'1)  for  i<m’' 

and  there  is  a k such  that 
TO


< 


m 


Suppose that we have proven that

\[ w_{\pi(i-1,i)}(i,1) \le v_{\pi(i,m)}(i,1), \quad \text{for some } i > 1 . \]

This implies that

\[ w_{\pi(i-1,i)}(i,1) \le v_{\pi(i-1,m)}(i,1) , \]

which is equivalent to

\[ w_{\pi(i-1,i)}(i-1,1) \le v_{\pi(i-1,m)}(i-1,1) . \]

Using the condition of the lemma, we obtain

\[ w_{\pi(i-2,i-1)}(i-1,1) \le v_{\pi(i-1,m)}(i-1,1) . \]

This completes the induction proof.

Q.E.D.

Lemma 28: If

\[ v_{\pi(0,m'')}(0,1) \le v_{\pi(-1,m'')}(0,1) , \]

\[ w_{\pi(i-1,i)}(i,1) \le w_{\pi(i,i+1)}(i,1), \quad \text{for } i \in N , \]

and if there is a $k$ such that

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) \;\begin{cases} > 0, & \text{for } i < k , \\ \le 0, & \text{for } i \ge k , \end{cases} \]

then $\pi(0,m'')$ is optimal.

Proof: By Lemmas 23 and 24, we only need to show that conditions (c) and (d) hold for $i < m''$.

Condition (c) holds trivially for $i > 0$. It also holds for $i = 0$ by the first assumption of the lemma. Condition (d) holds for $i < m''$ by Lemma 27. Thus, $\pi(0,m'')$ is unimprovable and optimal.

Q.E.D.


Theorem 29: If $m''$ is finite, if

\[ w_{\pi(i-1,i)}(i,1) \le w_{\pi(i,i+1)}(i,1), \quad \text{for } i \in N , \]

and if there is a $k$ such that

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) \;\begin{cases} > 0, & \text{for } i < k , \\ \le 0, & \text{for } i \ge k , \end{cases} \]

then there is a natural hysteretic policy which is optimal, and it has the following characterization.

If $m' < m''$, then $\pi(-1,m')$ is optimal. If $m'' < m'$, then $\pi(0,m'')$ is optimal. If $m' = m''$, then $\pi(-1,m')$ or $\pi(0,m'')$ is optimal according to which of the two policies minimizes $v_\pi(0,0)$.

Proof: Consider the policy $\pi(0,m'')$. If

\[ v_{\pi(0,m'')}(0,1) \le v_{\pi(-1,m'')}(0,1) , \]

then $\pi(0,m'')$ is optimal by Lemma 28. If

\[ v_{\pi(0,m'')}(0,1) > v_{\pi(-1,m'')}(0,1) , \]

then $m'$ is finite and $\pi(-1,m')$ is optimal. We now prove this assertion.

Therefore, assume now that

\[ v_{\pi(0,m'')}(0,1) > v_{\pi(-1,m'')}(0,1) . \]

This is equivalent to

\[ v_{\pi(0,m'')}(m'',1) > v_{\pi(-1,m'')}(m'',1) . \]

By Lemma 25,

\[ m' = \min\{\, m \in N \cup \{\infty\} \mid u_{\pi(m,m+1)}(m,1) \ge v_{\pi(-1,m)}(m,1) \,\} , \]

\[ m'' = \min\{\, m \in N \mid u_{\pi(m,m+1)}(m,1) \ge v_{\pi(0,m)}(m,1) \,\} . \]

Thus $m'$ is less than or equal to $m''$, and thus it is finite.
By  Lemma  27, 

Vo,.')*1'11  ^ Vm')*1'11,  for  0 < 1 *= ■ 


This  leads  to 


'V(-l,m')(i'l)  - Vf-l,®")*1’11 

s Vo,«")(1,1) 

iTl(i,«')(Wl’  f°r  0<i<B’  ' 

Thus, the conditions of Lemma 26 are satisfied, and we can conclude that $\pi(-1,m')$ is optimal.

Therefore, if $\pi(-1,m')$ is optimal, then $m' \le m''$. Suppose that $\pi(-1,m')$ is not optimal. Then
\[ v_{\pi(i,m')}(i,1) < v_{\pi(-1,m')}(i,1), \quad \text{for some } 1 \le i < m' . \]

Using Lemma 27, we obtain

\[ v_{\pi(0,m')}(m',1) \le v_{\pi(-1,m')}(m',1) . \]

Therefore, $m''$ is less than or equal to $m'$ by Lemma 25. This completes the proof of the theorem.

Q.E.D.

Lemma  30:  If  there  is  an  e > 0 such  that 


h(i+l)  - h(i)  >|  (K  + (l-u))R1)  + c,  for  i e KQ  , 


then  m"  is  finite. 


Proof:  Suppose  first  that 


h(i+l)  - h(i)  = | (K  + (l-co^)  + e,  for  i e NQ  , 


and let $m''_0$ denote the value of $m''$ for this case. From Section 3.2, we know that $m''_0$ is finite.

Consider  now  the  general  case  where 


h(i+l)  - h(i)  > (K  + ( l-co)R^)  + e,  for  i e Nc 


Clearly

\[ u_{\pi(m''_0,m''_0+1)}(m''_0,1) \ge v_{\pi(0,m''_0)}(m''_0,1) , \]

since the number of customers in the system is always at least as large when $\pi(m,m+1)$ is used as when $\pi(0,m)$ is used (for each $m$).




If 


h(i+l)  - h(i)  = -J-  (K  + (l-co)R1),  for  i e KQ  , 


then

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) = 0, \quad \text{for } i \in N . \]

Therefore, in the general case,

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) \le 0, \quad \text{for } i \in N , \]

since the number of customers in the system is always at least as large when $\pi(i,i+1)$ is used as when $\pi(i-1,i)$ is used (for each $i \in N$). Therefore, we can use Lemma 25 to conclude that $m''$ is less than or equal to $m''_0$. Thus, $m''$ is finite.


Q.E.D.


Lemma 31: If there is an $\varepsilon > 0$ and an $n < \infty$ such that


h(i+l)  - h(i)  > ^ (K  + (l-cu)R^)  + e,  for  i > n , 


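if

\[ w_{\pi(i-1,i)}(i,1) \le w_{\pi(i,i+1)}(i,1), \quad \text{for } i \in N , \]

and if there is a $k$ such that

\[ u_{\pi(i-1,i)}(i,1) - u_{\pi(i,i+1)}(i,1) \;\begin{cases} > 0, & \text{for } i < k , \\ \le 0, & \text{for } i \ge k , \end{cases} \]

then $m''$ is finite.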
Proof: Let $m''_1$ be the smallest integer such that

\[ v_{\pi(n,m''_1)}(n,0) \le v_{\pi(n,m)}(n,0), \quad \text{for } m \ge n . \]

Consider now the queueing system where the holding cost function $h(i)$ has been replaced by the holding cost function $h(i+n)$. Let $m''_0$ be as in Lemma 30 for this system. Then $m''_0$ is finite, and

\[ m''_0 = m''_1 - n . \]

This implies that $m''_1$ is finite.

By Lemma 27,

\[ v_{\pi(0,m''_1)}(n,1) \le v_{\pi(n,m''_1)}(n,1) . \]

This implies that

\[ u_{\pi(m''_1,m''_1+1)}(m''_1,1) \ge v_{\pi(0,m''_1)}(m''_1,1) . \]

Using Lemma 25, we conclude that $m'' \le m''_1$. Thus $m''$ is finite.


Q.E.D.


Theorem 32: If $h$ is convex, if


h(i+l)  - h(i)  >-  (K  - (1-u))R2),  for  i e NQ  , 


and  if 


h(i+I)  - h(i)  > — (K  + ( l-cw)R^) , for  some  i e NQ  , 


then the conditions of Theorem 29 hold and there is a natural hysteretic policy which is optimal.


Proof: It is easy to see that

\[ u_{\pi(i,i+1)}(i,1) - u_{\pi(i-1,i)}(i,1) \]

is a non-decreasing function of $i$, since $h$ is convex. Thus the last condition of Theorem 29 holds.

If 

h(i+l)  - h(i)  = | (K  - (1-o>)r2),  for  i e NQ  , 

then

\[ w_{\pi(i-1,i)}(i,1) = w_{\pi(i,i+1)}(i,1), \quad \text{for } i \in N . \]

Therefore,

h(i+l)  - h(i)  > “ (K  - (l-o>)Rg),  for  i e NQ  , 

implies that the second condition of Theorem 29 is satisfied.

By Lemma 31, the first condition ($m'' < \infty$) of Theorem 29 also holds. Thus, the theorem follows.

Q.E.D.

Theorem  33:  If  there  is  an  e > 0 such  that 

h(i+l)  - h(i)  (K  + (l-co)R1)  + e,  for  i e NQ  , 

the conditions of Theorem 29 are satisfied and there is a natural hysteretic policy which is optimal.

Proof: Since $R_1 + R_2 \ge 0$,

h(i+l)  - h(i)  (K  - (l-cojRg),  for  i e NQ  . 
As was shown in the proof of Theorem 32, this implies that the second condition of Theorem 29 holds.

As was shown in Lemma 30,

h(i+l)  - h(i)  >|  (K  - (l-a>)B1),  for  i e NQ  , 

implies that the last condition of Theorem 29 holds. By the same lemma, the first condition ($m'' < \infty$) of Theorem 29 holds. Thus, the theorem follows.

Q.E.D.

Blackburn (1971) studied the case where the holding cost function is convex. His Theorem 14 (in Chapter 5) is equivalent to our Theorem 33 with the exception that we do not require the holding cost function to be convex. When we do require that the holding cost function is convex, in Theorem 32, the other conditions are made considerably weaker.

Deb  (1976)  obtained  a result  almost  identical  to  our  Theorem  33  in 
an  independent  study  of  a related  problem. 

We will now give an efficient algorithm for computing $m'$ and $m''$ when they are finite. The only requirement is that $f(m)$ and $g(m)$ can be calculated efficiently for various values of $m$. For the cases where the holding cost function is a quadratic or exponential function, closed form expressions can be obtained for $f$ and $g$. The same is true for the case where the service times are constant or have an Erlang distribution. If the holding cost function has a linear tail, nearly closed form expressions can be obtained for $f$ and $g$. In the following, we assume that $f$ and $g$ can be evaluated efficiently at all relevant points. Since the algorithm is the same for both $m'$ and $m''$, we shall only outline it for $m'$.
Therefore, suppose $m'$ is finite. The following algorithm finds $m'$.


Step 1: Let $m = 1$.

Step 2: While $f(m) < 0$, double $m$.

Step 3: Let $\underline{m} = m/2$ and $\overline{m} = m$.

Step 4: While $\overline{m} - \underline{m} > 1$, do the following:

        Let $m = (\underline{m} + \overline{m})/2$.

        If $f(m) \ge 0$, then set $\overline{m} = m$.
        Otherwise, set $\underline{m} = m$.


Step 5: Let $m' = \overline{m}$.

That this algorithm really finds $m'$ follows from Lemma 25. The algorithm is essentially a bisection method, and it requires only $2 \log_2 m'$ function evaluations.
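The five steps above can be sketched in a few lines of Python. This is an illustration only: it assumes that `f` is a callable with a single sign change (negative below $m'$, non-negative from $m'$ on), which is the property the text derives from Lemma 25. The function name and the safety cap `m_cap` are hypothetical and do not appear in the report.

```python
def smallest_nonnegative_index(f, m_cap=2**40):
    """Return the smallest m >= 1 with f(m) >= 0.

    Assumes f(m) < 0 for every m below the answer and f(m) >= 0 from the
    answer onward. m_cap is a safety bound added for this sketch.
    """
    m = 1                                  # Step 1
    while f(m) < 0:                        # Step 2: doubling phase
        m *= 2
        if m > m_cap:
            raise ValueError("no sign change found below m_cap")
    lower, upper = m // 2, m               # Step 3: bracket the answer
    while upper - lower > 1:               # Step 4: bisection phase
        mid = (lower + upper) // 2
        if f(mid) >= 0:
            upper = mid                    # the answer is at or below mid
        else:
            lower = mid                    # the answer is above mid
    return upper                           # Step 5
```

The doubling phase needs about $\log_2 m'$ evaluations to bracket the answer and the bisection phase about as many again, which matches the count stated above. The same routine applied to `g` would give $m''$.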

We close this section by mentioning that most of the results in this section can be easily extended to cover hysteretic policies which are not necessarily natural. In particular, an algorithm, similar to the one above, can be constructed so that it finds the optimal value of both $n$ and $m$ in $\pi(n,m)$, where $n$ is not necessarily less than or equal to zero.




REFERENCES 

Balachandran, K. R. and Tijms, H. C., "On the D-Policy for the M/G/1 Queue," Mgt. Sci., 21, 1073-1076.

Bell, C. (1971), "Characterization and Computation of Optimal Policies for Operating an M/G/1 Queueing System with Removable Server," Oper. Res., 19, 208-216.

Bell, C. (1973), "Efficient Operation of Optimal Priority Queueing Systems," Oper. Res., 21, 777-786.

Blackburn, J. (1971), "Optimal Control of Queueing Systems with Intermittent Service," Tech. Rep. No. 8, Department of Operations Research, Stanford University.

Blackburn, J. (1972), "Optimal Control of a Single Server Queue with Balking and Reneging," Mgt. Sci., 19, 297-313.

Deb, R. (1976), "Optimal Control of Batch Service Queues with Switching Costs," Advances in Applied Probability, 8, 177-194.

Gebhard, R. F. (1967), "A Queueing Process with Bilevel Hysteretic Service Rate Control," Naval Res. Log. Quart., 14, No. 1.

Gross, D. and Harris, C. M. (1974), Fundamentals of Queueing Theory, John Wiley and Sons, Inc.

Heyman, D. (1968), "Optimal Operating Policies for M/G/1 Queueing Systems," Oper. Res., 16, 362-382.

Levy, Y. and Yechiali, U. (1975), "Utilization of Idle Time in an M/G/1 Queueing System," Mgt. Sci., 22, 202-211.

Orkenyi, P. (1976), "A Theory for Semi-Markov Decision Processes with Unbounded Costs and Its Application to the Optimal Control of Queueing Systems," Tech. Rep. No. 64 and No. 38, Department of Operations Research, Stanford University.

Prabhu, N. U. and Stidham, Jr., S. (1973), "Optimal Control of Queueing Systems," in Mathematical Methods in Queueing Theory, Conference at Western Michigan University, May 10-12.

Reed, C. (1974a), "Difference Equations and the Optimal Control of Single Server Queueing Systems," Tech. Rep. No. 23, Department of Operations Research, Stanford University.

Reed, F. C. (1974b), "The Effect of Stochastic Time Delays on Optimal Operating Policies for M/G/1 Queueing Systems with Intermittent Service," Tech. Rep. No. 45, Department of Operations Research, Stanford University.

Sobel, M. J. (1969), "Optimal Average Cost Policy for a Queue with Start-Up and Shut-Down Costs," Oper. Res., 17, No. 1, 145-162.

Tijms, H. C. (1973), "Optimal Control of the Workload in an M/G/1 Queueing System with Removable Server," Report Math. Centre, Amsterdam.

Wilde, D. J. and Beightler, C. S. (1967), Foundations of Optimization, Prentice-Hall, Englewood Cliffs.

Yadin, M. and Naor, P. (1963), "Queueing Systems with a Removable Service Station," Oper. Res. Quart., 14, No. 4, 393-405.




APPENDIX  A:  Definition  of  Basic  Symbols 


$\alpha$ = interest rate.


00 

7 = / t dF(t)  = second  moment  of  the  service  time. 

= / tdG(t)  = expected  length  of  the  start-up  time. 

Jo 


r 1 = / t dG(t)  = second  moment  of  the  start-up  time. 


$\lambda$ = arrival rate.


$\mu = \bigl( \int_0^\infty t \, dF(t) \bigr)^{-1}$ = service rate.

f*°°  -crt 

1=1  e dG(t)  = Laplace  transform  of  the  start-up  time. 


$\rho = \lambda / \mu$ = load on system.


c = — - = Laplace  transform  of  the  inter-arrival  times. 

A."KX 


-(ct+7  -Mr)t 


dG(t). 


$\psi = \int_0^\infty e^{-(\alpha + \lambda - \lambda\psi)t} \, dF(t)$ = Laplace transform of busy period.

$\omega = \int_0^\infty e^{-\alpha t} \, dF(t)$ = Laplace transform of the service time.

F = cumulative distribution function for the service times.
G = cumulative  distribution  function  for  the  start-up  times. 
APPENDIX C: The Laplace Transform of the Busy Period in the M/E_k/1 and M/D/1 Queueing Systems


This appendix contains graphs showing the Laplace transform of the busy period in the $M/E_k/1$ and the $M/D/1$ queueing systems. The busy period is defined as the time from when a customer arrives (to an empty system) until the system becomes empty again. The parameter of the transforms is denoted by $\alpha$. As before


\[ \rho = \lambda / \mu , \]


where $\lambda$ is the arrival rate and $\mu$ is the service rate. The Laplace transform is denoted by $\psi$.

For the $M/E_k/1$ queueing systems, $\psi$ is given by


\[ \psi = \Bigl( \frac{k\mu}{k\mu + \alpha + \lambda - \lambda\psi} \Bigr)^{k}, \quad \text{for } k \in N . \]


For the $M/D/1$ queueing system, $\psi$ is given by


\[ \psi = e^{-(\alpha + \lambda - \lambda\psi)/\mu} . \]
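For an M/G/1 queue the busy-period transform satisfies the implicit equation $\psi = F^*(\alpha + \lambda - \lambda\psi)$, where $F^*$ is the Laplace transform of the service time, so curves of the kind plotted in this appendix can be obtained numerically. The sketch below is not from the report: it solves the equation by fixed-point iteration started at zero, and the function names, tolerance and iteration cap are assumptions made for illustration. The Erlang and deterministic service-time transforms are both parameterized so that the mean service time is $1/\mu$.

```python
import math


def busy_period_lst(alpha, lam, service_lst, tol=1e-12, max_iter=100_000):
    """Solve psi = service_lst(alpha + lam - lam * psi) by fixed-point iteration.

    alpha       : transform parameter (interest rate), alpha > 0
    lam         : arrival rate
    service_lst : callable s -> Laplace transform of the service time at s
    """
    psi = 0.0
    for _ in range(max_iter):
        nxt = service_lst(alpha + lam - lam * psi)
        if abs(nxt - psi) < tol:
            return nxt
        psi = nxt
    raise RuntimeError("fixed-point iteration did not converge")


def erlang_lst(k, mu):
    """Laplace transform of an Erlang-k service time with mean 1/mu."""
    return lambda s: (k * mu / (k * mu + s)) ** k


def deterministic_lst(mu):
    """Laplace transform of a constant service time 1/mu."""
    return lambda s: math.exp(-s / mu)
```

For example, `busy_period_lst(0.1, 0.5, erlang_lst(2, 1.0))` evaluates the transform for an $M/E_2/1$ queue with $\rho = 0.5$ at $\alpha = 0.1$. Starting the iteration at zero makes the iterates increase toward the smallest root in $[0, 1]$, which is the relevant one.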
report considers the M/G/1 queueing system with
hysteretic policies are introduced. It is shown that there is a natural hysteretic policy which is average optimal, and that if the start-up times are instantaneous or the holding cost function convex, then there is a natural hysteretic policy which is undiscounted optimal. When discounting is used, the results are not as strong, except for the case where the holding cost function is linear. For the non-linear case we still obtain certain fairly weak sufficient conditions for a natural hysteretic policy to be optimal.




